PREFACE TO THE FOURTH EDITION

The main change from the third edition is that the chapter on quantum
electrodynamics has been rewritten. The quantum electrodynamics
given in the third edition describes the motion of individual charged
particles moving through the electromagnetic field, in close analogy
with classical electrodynamics. It is a form of theory in which the
number of charged particles is conserved and it cannot be generalized
to allow of variation of the number of charged particles.

In present-day high-energy physics the creation and annihilation of
charged particles is a frequent occurrence. A quantum electrodynamics
which demands conservation of the number of charged particles is
therefore out of touch with physical reality. So I have replaced it by
a quantum electrodynamics which includes creation and annihilation of
electron-positron pairs. This involves abandoning any close analogy
with classical electron theory, but provides a closer description of
nature. It seems that the classical concept of an electron is no
longer a useful model in physics, except possibly for elementary
theories that are restricted to low-energy phenomena.

P. A. M. D.

ST. JOHN’S COLLEGE, CAMBRIDGE

11  May  1957 


NOTE  TO  THE  REVISION  OF  THE 
FOURTH  EDITION 

The  opportunity  has  been  taken  of  revising  parts  of  Chapter  XII 
(‘Quantum electrodynamics’) and of adding two new sections on
interpretation and applications.

P. A. M. D.

ST. JOHN’S COLLEGE, CAMBRIDGE
26 May 1967


FROM  THE 

PREFACE  TO  THE  FIRST  EDITION 

The  methods  of  progress  in  theoretical  physics  have  undergone  a 
vast  change  during  the  present  century.  The  classical  tradition 
has  been  to  consider  the  world  to  be  an  association  of  observable 
objects  (particles,  fluids,  fields,  etc.)  moving  about  according  to 
definite  laws  of  force,  so  that  one  could  form  a mental  picture  in 
space  and  time  of  the  whole  scheme.  This  led  to  a physics  whose  aim 
was  to  make  assumptions  about  the  mechanism  and  forces  connecting 
these  observable  objects,  to  account  for  their  behaviour  in  the 
simplest  possible  way.  It  has  become  increasingly  evident  in  recent 
times,  however,  that  nature  works  on  a different  plan.  Her  funda- 
mental laws  do  not  govern  the  world  as  it  appears  in  our  mental 
picture  in  any  very  direct  way,  but  instead  they  control  a substra- 
tum of  which  we  cannot  form  a mental  picture  without  intro- 
ducing irrelevancies.  The  formulation  of  these  laws  requires  the  use 
of  the  mathematics  of  transformations.  The  important  things  in 
the  world  appear  as  the  invariants  (or  more  generally  the  nearly 
invariants,  or  quantities  with  simple  transformation  properties) 
of  these  transformations.  The  things  we  are  immediately  aware  of 
are  the  relations  of  these  nearly  invariants  to  a certain  frame  of 
reference,  usually  one  chosen  so  as  to  introduce  special  simplifying 
features which are unimportant from the point of view of general
theory. 

The  growth  of  the  use  of  transformation  theory,  as  applied  first  to 
relativity  and  later  to  the  quantum  theory,  is  the  essence  of  the  new 
method in theoretical physics. Further progress lies in the direction
of making our equations invariant under wider and still wider
transformations. This state of affairs is very satisfactory from a
philosophical point of view, as implying an increasing recognition of the
part played by the observer in himself introducing the regularities
that appear in his observations, and a lack of arbitrariness in the ways
of nature, but it makes things less easy for the learner of physics.
The  new  theories,  if  one  looks  apart  from  their  mathematical  setting, 
are  built  up  from  physical  concepts  which  cannot  be  explained  in 
terms  of  things  previously  known  to  the  student,  which  cannot  even 
be explained adequately in words at all. Like the fundamental concepts
(e.g. proximity, identity) which every one must learn on his
arrival  into  the  world,  the  newer  concepts  of  physics  can  be  mastered 
only  by  long  familiarity  with  their  properties  and  uses. 

From  the  mathematical  side  the  approach  to  the  new  theories 
presents  no  difficulties,  as  the  mathematics  required  (at  any  rate  that 
which  is  required  for  the  development  of  physics  up  to  the  present) 
is  not  essentially  different  from  what  has  been  current  for  a consider- 
able time.  Mathematics  is  the  tool  specially  suited  for  dealing  with 
abstract  concepts  of  any  kind  and  there  is  no  limit  to  its  power  in  this 
field.  For  this  reason  a book  on  the  new  physics,  if  not  purely  descrip- 
tive of  experimental  work,  must  be  essentially  mathematical.  All  the 
same  the  mathematics  is  only  a tool  and  one  should  learn  to  hold  the 
physical  ideas  in  one’s  mind  without  reference  to  the  mathematical 
form.  In  this  book  I have  tried  to  keep  the  physics  to  the  forefront, 
by  beginning  with  an  entirely  physical  chapter  and  in  the  later  work 
examining  the  physical  meaning  underlying  the  formalism  wherever 
possible.  The  amount  of  theoretical  ground  one  has  to  cover  before 
being  able  to  solve  problems  of  real  practical  value  is  rather  large,  but 
this  circumstance  is  an  inevitable  consequence  of  the  fundamental 
part  played  by  transformation  theory  and  is  likely  to  become  more 
pronounced  in  the  theoretical  physics  of  the  future. 

With  regard  to  the  mathematical  form  in  which  the  theory  can  be 
presented,  an  author  must  decide  at  the  outset  between  two  methods. 
There  is  the  symbolic  method,  which  deals  directly  in  an  abstract  way 
with  the  quantities  of  fundamental  importance  (the  invariants,  etc., 
of  the  transformations)  and  there  is  the  method  of  coordinates  or 
representations,  which  deals  with  sets  of  numbers  corresponding  to 
these  quantities.  The  second  of  these  has  usually  been  used  for  the 
presentation of quantum mechanics (in fact it has been used practically
exclusively with the exception of Weyl’s book Gruppentheorie
und Quantenmechanik). It is known under one or other of the two
names ‘Wave Mechanics’ and ‘Matrix Mechanics’ according to which
physical things receive emphasis in the treatment, the states of a
system or its dynamical variables. It has the advantage that the kind
of mathematics required is more familiar to the average student, and
also it is the historical method.

The  symbolic  method,  however,  seems  to  go  more  deeply  into  the 
nature of things. It enables one to express the physical laws in a neat
and concise way, and will probably be increasingly used in the future
as it becomes better understood and its own special mathematics gets
developed. For this reason I have chosen the symbolic method,
introducing the representatives later merely as an aid to practical
calculation. This has necessitated a complete break from the historical
line of development, but this break is an advantage through
enabling the approach to the new ideas to be made as direct as
possible.


P.  A.  M.  D. 


ST. JOHN’S COLLEGE, CAMBRIDGE


29 May 1930


I

THE  PRINCIPLE  OF  SUPERPOSITION 
1.  The  need  for  a quantum  theory 

Classical  mechanics  has  been  developed  continuously  from  the  time 
of  Newton  and  applied  to  an  ever-widening  range  of  dynamical 
systems,  including  the  electromagnetic  field  in  interaction  with 
matter.  The  underlying  ideas  and  the  laws  governing  their  applica- 
tion form  a simple  and  elegant  scheme,  which  one  would  be  inclined 
to  think  could  not  be  seriously  modified  without  having  all  its 
attractive  features  spoilt.  Nevertheless  it  has  been  found  possible  to 
set  up  a new  scheme,  called  quantum  mechanics,  which  is  more 
suitable  for  the  description  of  phenomena  on  the  atomic  scale  and 
which  is  in  some  respects  more  elegant  and  satisfying  than  the 
classical  scheme.  This  possibility  is  due  to  the  changes  which  the 
new  scheme  involves  being  of  a very  profound  character  and  not 
clashing  with  the  features  of  the  classical  theory  that  make  it  so 
attractive,  as  a result  of  which  all  these  features  can  be  incorporated 
in  the  new  scheme. 

The  necessity  for  a departure  from  classical  mechanics  is  clearly 
shown  by  experimental  results.  In  the  first  place  the  forces  known 
in  classical  electrodynamics  are  inadequate  for  the  explanation  of  the 
remarkable  stability  of  atoms  and  molecules,  which  is  necessary  in 
order  that  materials  may  have  any  definite  physical  and  chemical 
properties  at  all.  The  introduction  of  new  hypothetical  forces  will  not 
save  the  situation,  since  there  exist  general  principles  of  classical 
mechanics,  holding  for  all  kinds  of  forces,  leading  to  results  in  direct 
disagreement  with  observation.  For  example,  if  an  atomic  system  has 
its  equilibrium  disturbed  in  any  way  and  is  then  left  alone,  it  will  be  set 
in  oscillation  and  the  oscillations  will  get  impressed  on  the  surround- 
ing electromagnetic  field,  so  that  their  frequencies  may  be  observed 
with  a spectroscope.  Now  whatever  the  laws  of  force  governing  the 
equilibrium,  one  would  expect  to  be  able  to  include  the  various  fre- 
quencies in  a scheme  comprising  certain  fundamental  frequencies  and 
their  harmonics.  This  is  not  observed  to  be  the  case.  Instead,  there 
is  observed  a new  and  unexpected  connexion  between  the  frequencies, 
called  Ritz’s  Combination  Law  of  Spectroscopy,  according  to  which  all 
the frequencies can be expressed as differences between certain terms,
the number of terms being much less than the number of frequencies.
This law is quite unintelligible from the classical standpoint.
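
The content of the Combination Law can be made concrete with a small numerical sketch. The terms used below, T_n = R/n² for hydrogen with R given in wavenumber units (proportional to frequency), are an assumption introduced only for illustration and do not appear in the text above. The sketch shows that a handful of terms generates many more lines, and that the lines combine additively.

```python
# Illustration of Ritz's Combination Law with hydrogen-like terms.
# Assumption (not from the text): T_n = R_H / n**2, in cm^-1.
from itertools import combinations

R_H = 109677.58  # approximate Rydberg constant for hydrogen, cm^-1


def term(n: int) -> float:
    """Spectral term T_n in cm^-1."""
    return R_H / n**2


def line(m: int, n: int) -> float:
    """Wavenumber of the line given by the difference of terms m < n."""
    return term(m) - term(n)


levels = range(1, 7)  # six terms
lines = {(m, n): line(m, n) for m, n in combinations(levels, 2)}

print(len(levels), "terms give", len(lines), "lines")
# Combination property: line(1, 3) equals line(1, 2) + line(2, 3).
print(line(1, 3), line(1, 2) + line(2, 3))
```

With N terms there are N(N-1)/2 possible differences, which is why the number of terms is much less than the number of frequencies.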

One  might  try  to  get  over  the  difficulty  without  departing  from 
classical  mechanics  by  assuming  each  of  the  spectroscopically  ob- 
served frequencies  to  be  a fundamental  frequency  with  its  own  degree 
of  freedom,  the  laws  of  force  being  such  that  the  harmonic  vibrations 
do  not  occur.  Such  a theory  will  not  do,  however,  even  apart  from 
the  fact  that  it  would  give  no  explanation  of  the  Combination  Law, 
since  it  would  immediately  bring  one  into  conflict  with  the  experi- 
mental evidence  on  specific  heats.  Classical  statistical  mechanics 
enables  one  to  establish  a general  connexion  between  the  total  number 
of  degrees  of  freedom  of  an  assembly  of  vibrating  systems  and  its 
specific  heat.  If  one  assumes  all  the  spectroscopic  frequencies  of  an 
atom  to  correspond  to  different  degrees  of  freedom,  one  would  get  a 
specific  heat  for  any  kind  of  matter  very  much  greater  than  the 
observed  value.  In  fact  the  observed  specific  heats  at  ordinary 
temperatures  are  given  fairly  well  by  a theory  that  takes  into  account 
merely  the  motion  of  each  atom  as  a whole  and  assigns  no  internal 
motion  to  it  at  all. 

This  leads  us  to  a new  clash  between  classical  mechanics  and  the 
results  of  experiment.  There  must  certainly  be  some  internal  motion 
in an atom to account for its spectrum, but the internal degrees of
freedom,  for  some  classically  inexplicable  reason,  do  not  contribute 
to  the  specific  heat.  A similar  clash  is  found  in  connexion  with  the 
energy  of  oscillation  of  the  electromagnetic  field  in  a vacuum.  Classical 
mechanics requires the specific heat corresponding to this energy to
be  infinite,  but  it  is  observed  to  be  quite  finite.  A general  conclusion 
from  experimental  results  is  that  oscillations  of  high  frequency  do 
not  contribute  their  classical  quota  to  the  specific  heat. 
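
The size of this effect can be sketched numerically. The formula below is the standard quantum result for a single harmonic oscillator (the Einstein function). It anticipates the theory rather than following from anything derived so far, and is brought in here only as an assumption for illustration. Heat capacities are expressed in units of Boltzmann's constant k, so that the classical quota of one oscillator is 1.

```python
import math

H = 6.62607015e-34  # Planck constant, J s (SI defining value)
K_B = 1.380649e-23  # Boltzmann constant, J/K (SI defining value)


def classical_quota() -> float:
    """Heat capacity of one classical harmonic oscillator, in units of k."""
    return 1.0


def quantum_contribution(freq_hz: float, temp_k: float) -> float:
    """Heat capacity of one quantized oscillator, in units of k.

    Assumed formula (Einstein function): x^2 e^x / (e^x - 1)^2,
    with x = h * frequency / (k * T).
    """
    x = H * freq_hz / (K_B * temp_k)
    if x > 500:  # avoid overflow in exp; the contribution is negligible
        return 0.0
    ex = math.exp(x)
    return x * x * ex / (ex - 1.0) ** 2


# A slow vibration keeps nearly its classical quota at room temperature,
# while a vibration of optical frequency contributes practically nothing.
print(quantum_contribution(1e12, 300.0))
print(quantum_contribution(5e14, 300.0))
```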

As  another  illustration  of  the  failure  of  classical  mechanics  we  may 
consider the behaviour of light. We have, on the one hand, the
phenomena  of  interference  and  diffraction,  which  can  be  explained 
only  on  the  basis  of  a wave  theory;  on  the  other,  phenomena  such  as 
photo-electric  emission  and  scattering  by  free  electrons,  which  show 
that light is composed of small particles. These particles, which
are called photons, have each a definite energy and momentum,
depending on the frequency of the light, and appear to have just as
real an existence as electrons, or any other particles known in physics.
A fraction of a photon is never observed.

Experiments have shown that this anomalous behaviour is not
peculiar  to  light,  but  is  quite  general.  All  material  particles  have 
wave  properties,  which  can  be  exhibited  under  suitable  conditions. 
We  have  here  a very  striking  and  general  example  of  the  breakdown 
of  classical  mechanics — not  merely  an  inaccuracy  in  its  laws  of  motion, 
but  an  inadequacy  of  its  concepts  to  supply  us  with  a description  of 
atomic  events. 

The  necessity  to  depart  from  classical  ideas  when  one  wishes  to 
account  for  the  ultimate  structure  of  matter  may  be  seen,  not  only 
from  experimentally  established  facts,  but  also  from  general  philo- 
sophical grounds.  In  a classical  explanation  of  the  constitution  of 
matter,  one  would  assume  it  to  be  made  up  of  a large  number  of  small 
constituent  parts  and  one  would  postulate  laws  for  the  behaviour  of 
these  parts,  from  which  the  laws  of  the  matter  in  bulk  could  be  de- 
duced. This  would  not  complete  the  explanation,  however,  since  the 
question  of  the  structure  and  stability  of  the  constituent  parts  is  left 
untouched.  To  go  into  this  question,  it  becomes  necessary  to  postu- 
late that  each  constituent  part  is  itself  made  up  of  smaller  parts,  in 
terms  of  which  its  behaviour  is  to  be  explained.  There  is  clearly  no 
end  to  this  procedure,  so  that  one  can  never  arrive  at  the  ultimate 
structure  of  matter  on  these  lines.  So  long  as  big  and  small  are  merely 
relative  concepts,  it  is  no  help  to  explain  the  big  in  terms  of  the  small. 
It  is  therefore  necessary  to  modify  classical  ideas  in  such  a way  as  to 
give  an  absolute  meaning  to  size. 

At  this  stage  it  becomes  important  to  remember  that  science  is 
concerned  only  with  observable  things  and  that  we  can  observe  an 
object  only  by  letting  it  interact  with  some  outside  influence.  An  act 
of  observation  is  thus  necessarily  accompanied  by  some  disturbance 
of  the  object  observed.  We  may  define  an  object  to  be  big  when  the 
disturbance  accompanying  our  observation  of  it  may  be  neglected, 
and  small  when  the  disturbance  cannot  be  neglected.  This  definition 
is  in  close  agreement  with  the  common  meanings  of  big  and  small. 

It  is  usually  assumed  that,  by  being  careful,  we  may  cut  down  the 
disturbance  accompanying  our  observation  to  any  desired  extent. 
The  concepts  of  big  and  small  are  then  purely  relative  and  refer  to  the 
gentleness  of  our  means  of  observation  as  well  as  to  the  object  being 
described.  In  order  to  give  an  absolute  meaning  to  size,  such  as  is 
required  for  any  theory  of  the  ultimate  structure  of  matter,  we  have 
to assume that there is a limit to the fineness of our powers of observation
and the smallness of the accompanying disturbance, a limit which is
inherent  in  the  nature  of  things  and  can  never  be  surpassed  by  improved 
technique  or  increased  skill  on  the  part  of  the  observer.  If  the  object  under 
observation  is  such  that  the  unavoidable  limiting  disturbance  is  negli- 
gible, then  the  object  is  big  in  the  absolute  sense  and  we  may  apply 
classical  mechanics  to  it.  If,  on  the  other  hand,  the  limiting  dis- 
turbance is  not  negligible,  then  the  object  is  small  in  the  absolute 
sense  and  we  require  a new  theory  for  dealing  with  it. 

A consequence  of  the  preceding  discussion  is  that  we  must  revise 
our  ideas  of  causality.  Causality  applies  only  to  a system  which  is 
left  undisturbed.  If  a system  is  small,  we  cannot  observe  it  without 
producing  a serious  disturbance  and  hence  we  cannot  expect  to  find 
any  causal  connexion  between  the  results  of  our  observations. 
Causality  will  still  be  assumed  to  apply  to  undisturbed  systems  and 
the  equations  which  will  be  set  up  to  describe  an  undisturbed  system 
will  be  differential  equations  expressing  a causal  connexion  between 
conditions  at  one  time  and  conditions  at  a later  time.  These  equations 
will  be  in  close  correspondence  with  the  equations  of  classical 
mechanics,  but  they  will  be  connected  only  indirectly  with  the  results 
of  observations.  There  is  an  unavoidable  indeterminacy  in  the  calcu- 
lation of  observational  results,  the  theory  enabling  us  to  calculate  in 
general  only  the  probability  of  our  obtaining  a particular  result  when 
we  make  an  observation. 


2. The polarization of photons

The  discussion  in  the  preceding  section  about  the  limit  to  the 
gentleness  with  which  observations  can  be  made  and  the  consequent 
indeterminacy  in  the  results  of  those  observations  does  not  provide 
any  quantitative  basis  for  the  building  up  of  quantum  mechanics. 
For  this  purpose  a new  set  of  accurate  laws  of  nature  is  required. 
One  of  the  most  fundamental  and  most  drastic  of  these  is  the  Principle 
of  Superposition  of  States.  We  shall  lead  up  to  a general  formulation 
of this principle through a consideration of some special cases, taking
first the example provided by the polarization of light.

It  is  known  experimentally  that  when  plane-polarized  light  is  used 
for ejecting photo-electrons, there is a preferential direction for the
electron  emission.  Thus  the  polarization  properties  of  light  are  closely 
connected  with  its  corpuscular  properties  and  one  must  ascribe  a 
polarization to the photons. One must consider, for instance, a beam
of light plane-polarized in a certain direction as consisting of photons
each of which is plane-polarized in that direction and a beam of
circularly polarized light as consisting of photons each circularly
polarized.  Every  photon  is  in  a certain  state  of  polarization,  as  we 
shall  say.  The  problem  we  must  now  consider  is  how  to  fit  in  these 
ideas  with  the  known  facts  about  the  resolution  of  light  into  polarized 
components  and  the  recombination  of  these  components. 

Let  us  take  a definite  case.  Suppose  we  have  a beam  of  light  passing 
through  a crystal  of  tourmaline,  which  has  the  property  of  letting 
through  only  light  plane-polarized  perpendicular  to  its  optic  axis. 
Classical  electrodynamics  tells  us  what  will  happen  for  any  given 
polarization of the incident beam. If this beam is polarized
perpendicular to the optic axis, it will all go through the crystal; if
parallel to the axis, none of it will go through; while if polarized at
an angle α to the axis, a fraction sin²α will go through. How are we
to understand these results on a photon basis?

A beam  that  is  plane-polarized  in  a certain  direction  is  to  be 
pictured as made up of photons each plane-polarized in that
direction.  This  picture  leads  to  no  difficulty  in  the  cases  when  our 
incident  beam  is  polarized  perpendicular  or  parallel  to  the  optic  axis. 
We  merely  have  to  suppose  that  each  photon  polarized  perpendicular 
to  the  axis  passes  unhindered  and  unchanged  through  the  crystal, 
while  each  photon  polarized  parallel  to  the  axis  is  stopped  and  ab- 
sorbed. A difficulty  arises,  however,  in  the  case  of  the  obliquely 
polarized  incident  beam.  Each  of  the  incident  photons  is  then 
obliquely  polarized  and  it  is  not  clear  what  will  happen  to  such  a 
photon  when  it  reaches  the  tourmaline. 

A question  about  what  will  happen  to  a particular  photon  under 
certain  conditions  is  not  really  very  precise.  To  make  it  precise  one 
must  imagine  some  experiment  performed  having  a bearing  on  the 
question  and  inquire  what  will  be  the  result  of  the  experiment.  Only 
questions  about  the  results  of  experiments  have  a real  significance 
and  it  is  only  such  questions  that  theoretical  physics  has  to  consider. 

In  our  present  example  the  obvious  experiment  is  to  use  an  incident 
beam  consisting  of  only  a single  photon  and  to  observe  what  appears 
on  the  back  side  of  the  crystal.  According  to  quantum  mechanics 
the  result  of  this  experiment  will  be  that  sometimes  one  will  find  a 
whole  photon,  of  energy  equal  to  the  energy  of  the  incident  photon, 
on the back side and other times one will find nothing. When one
finds a whole photon, it will be polarized perpendicular to the optic
axis. One will never find only a part of a photon on the back side.
If one repeats the experiment a large number of times, one will find
the photon on the back side in a fraction sin²α of the total number
of times. Thus we may say that the photon has a probability sin²α
of passing through the tourmaline and appearing on the back side
polarized perpendicular to the axis and a probability cos²α of being
absorbed.  These  values  for  the  probabilities  lead  to  the  correct 
classical  results  for  an  incident  beam  containing  a large  number  of 
photons. 
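
The way in which single-photon probabilities reproduce the classical fraction can be sketched with a simple simulation. The function names and the use of a seeded pseudo-random generator are choices made here for illustration only. The sketch assumes nothing beyond the probability sin²α stated above, and it says nothing about what decides the fate of any individual photon.

```python
import math
import random


def photon_passes(alpha: float, rng: random.Random) -> bool:
    """One single-photon trial: True if a whole photon emerges, else False."""
    return rng.random() < math.sin(alpha) ** 2


def transmitted_fraction(alpha: float, trials: int, seed: int = 0) -> float:
    """Fraction of repeated single-photon trials in which the photon passes."""
    rng = random.Random(seed)
    passed = sum(photon_passes(alpha, rng) for _ in range(trials))
    return passed / trials


alpha = math.radians(30.0)  # angle between polarization and optic axis
print(transmitted_fraction(alpha, 100_000), math.sin(alpha) ** 2)
```

Each trial yields either a whole photon or nothing, never a fraction of one. Only the long-run frequency approaches the classical value sin²α.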

In  this  way  we  preserve  the  individuality  of  the  photon  in  all 
cases.  We  are  able  to  do  this,  however,  only  because  we  abandon  the 
determinacy  of  the  classical  theory.  The  result  of  an  experiment  is 
not  determined,  as  it  would  be  according  to  classical  ideas,  by  the 
conditions  under  the  control  of  the  experimenter.  The  most  that  can 
be  predicted  is  a set  of  possible  results,  with  a probability  of  occur- 
rence for  each. 

The  foregoing  discussion  about  the  result  of  an  experiment  with  a 
single  obliquely  polarized  photon  incident  on  a crystal  of  tourmaline 
answers  all  that  can  legitimately  be  asked  about  what  happens  to  an 
obliquely  polarized  photon  when  it  reaches  the  tourmaline.  Questions 
about  what  decides  whether  the  photon  is  to  go  through  or  not  and 
how  it  changes  its  direction  of  polarization  when  it  does  go  through 
cannot  be  investigated  by  experiment  and  should  be  regarded  as 
outside  the  domain  of  science.  Nevertheless  some  further  description 
is  necessary  in  order  to  correlate  the  results  of  this  experiment  with 
the  results  of  other  experiments  that  might  be  performed  with 
photons  and  to  fit  them  all  into  a general  scheme.  Such  further 
description  should  be  regarded,  not  as  an  attempt  to  answer  questions 
outside the domain of science, but as an aid to the formulation of
rules for expressing concisely the results of large numbers of
experiments.

The  further  description  provided  by  quantum  mechanics  runs  as 
follows.  It  is  supposed  that  a photon  polarized  obliquely  to  the  optic 
axis  may  be  regarded  as  being  partly  in  the  state  of  polarization 
parallel  to  the  axis  and  partly  in  the  state  of  polarization  perpen- 
dicular to  the  axis.  The  state  of  oblique  polarization  may  be  con- 
sidered as  the  result  of  some  kind  of  superposition  process  applied  to 
the two states of parallel and perpendicular polarization. This implies
a certain special kind of relationship between the various states of
polarization,  a relationship  similar  to  that  between  polarized  beams  in 
classical  optics,  but  which  is  now  to  be  applied,  not  to  beams,  but  to 
the  states  of  polarization  of  one  particular  photon.  This  relationship 
allows  any  state  of  polarization  to  be  resolved  into,  or  expressed  as  a 
superposition  of,  any  two  mutually  perpendicular  states  of  polari- 
zation. 
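
The resolution just described may be written out by analogy with classical optics. The unit vectors below are notation introduced here for illustration and are not symbols used in the text. With α the angle between the polarization and the optic axis, the state is resolved into parallel and perpendicular components whose squares give the probabilities of the preceding discussion. The second line shows the same state resolved along any other pair of perpendicular directions, rotated through an arbitrary angle θ.

```latex
\mathbf{e}_\alpha = \cos\alpha\,\mathbf{e}_\parallel + \sin\alpha\,\mathbf{e}_\perp ,
\qquad
P_\parallel = \cos^2\alpha , \quad
P_\perp = \sin^2\alpha , \quad
P_\parallel + P_\perp = 1 .

\mathbf{e}_1 = \cos\theta\,\mathbf{e}_\parallel + \sin\theta\,\mathbf{e}_\perp , \quad
\mathbf{e}_2 = -\sin\theta\,\mathbf{e}_\parallel + \cos\theta\,\mathbf{e}_\perp ,
\qquad
\mathbf{e}_\alpha = \cos(\alpha-\theta)\,\mathbf{e}_1 + \sin(\alpha-\theta)\,\mathbf{e}_2 .
```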

When  we  make  the  photon  meet  a tourmaline  crystal,  we  are  sub- 
jecting it  to  an  observation.  We  are  observing  whether  it  is  polarized 
parallel  or  perpendicular  to  the  optic  axis.  The  effect  of  making  this 
observation  is  to  force  the  photon  entirely  into  the  state  of  parallel 
or  entirely  into  the  state  of  perpendicular  polarization.  It  has  to 
make  a sudden  jump  from  being  partly  in  each  of  these  two  states  to 
being  entirely  in  one  or  other  of  them.  Which  of  the  two  states  it  will 
jump  into  cannot  be  predicted,  but  is  governed  only  by  probability 
laws.  If  it  jumps  into  the  parallel  state  it  gets  absorbed  and  if  it 
jumps  into  the  perpendicular  state  it  passes  through  the  crystal  and 
appears  on  the  other  side  preserving  this  state  of  polarization. 

3.  Interference  of  photons 

In  this  section  we  shall  deal  with  another  example  of  superposition. 
We  shall  again  take  photons,  but  shall  be  concerned  with  their  posi- 
tion in  space  and  their  momentum  instead  of  their  polarization.  If 
we  are  given  a beam  of  roughly  monochromatic  light,  then  we  know 
something  about  the  location  and  momentum  of  the  associated 
photons.  We  know  that  each  of  them  is  located  somewhere  in  the 
region of space through which the beam is passing and has a momentum
in the direction of the beam of magnitude given in terms of the
frequency  of  the  beam  by  Einstein’s  photo-electric  law — momentum 
equals  frequency  multiplied  by  a universal  constant.  When  we  have 
such  information  about  the  location  and  momentum  of  a photon  we 
shall  say  that  it  is  in  a definite  translational  state. 
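
As a rough numerical sketch of this relation, the universal constant may be taken to be h/c, Planck's constant divided by the velocity of light, so that the momentum is hν/c and the energy hν. These symbols and the sample frequency are supplied here for illustration and are not written out in the text.

```python
H = 6.62607015e-34  # Planck constant, J s (SI defining value)
C = 299_792_458.0   # velocity of light, m/s (SI defining value)


def photon_energy(freq_hz: float) -> float:
    """Energy of a photon of the given frequency, in joules."""
    return H * freq_hz


def photon_momentum(freq_hz: float) -> float:
    """Momentum of a photon: frequency times the constant h/c, in kg m/s."""
    return (H / C) * freq_hz


freq = 5.0e14  # a frequency in the visible range, chosen as an example
print(photon_energy(freq), photon_momentum(freq))
```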

We  shall  discuss  the  description  which  quantum  mechanics  pro- 
vides of  the  interference  of  photons.  Let  us  take  a definite  experi- 
ment demonstrating  interference.  Suppose  we  have  a beam  of  light 
which  is  passed  through  some  kind  of  interferometer,  so  that  it  gets 
split  up  into  two  components  and  the  two  components  are  subse- 
quently made  to  interfere.  We  may,  as  in  the  preceding  section,  take 
an incident beam consisting of only a single photon and inquire what
will happen to it as it goes through the apparatus. This will present
to us the difficulty of the conflict between the wave and corpuscular
theories of light in an acute form.

Corresponding  to  the  description  that  we  had  in  the  case  of  the 
polarization,  we  must  now  describe  the  photon  as  going  partly  into 
each of the two components into which the incident beam is split.
The photon is then, as we may say, in a translational state given by the
superposition of the two translational states associated with the two
components. We are thus led to a generalization of the term
‘translational state’ applied to a photon. For a photon to be in a definite
translational state it need not be associated with one single beam of
light, but may be associated with two or more beams of light which
are the components into which one original beam has been split.† In
the accurate mathematical theory each translational state is associated
with one of the wave functions of ordinary wave optics, which wave
function may describe either a single beam or two or more beams
into which one original beam has been split. Translational states are
thus superposable in a similar way to wave functions.

Let  us  consider  now  what  happens  when  we  determine  the  energy 
in one of the components. The result of such a determination must
be either the whole photon or nothing at all. Thus the photon must
change suddenly from being partly in one beam and partly in the
other to being entirely in one of the beams. This sudden change is
due to the disturbance in the translational state of the photon which
the observation necessarily makes. It is impossible to predict in which
of the two beams the photon will be found. Only the probability of
either result can be calculated from the previous distribution of the
photon  over  the  two  beams. 

One  could  carry  out  the  energy  measurement  without  destroying  the 
component  beam  by,  for  example,  reflecting  the  beam  from  a movable 
mirror  and  observing  the  recoil.  Our  description  of  the  photon  allows 
us  to  infer  that,  after  such  an  energy  measurement,  it  would  not  be 
possible  to  bring  about  any  interference  effects  between  the  two  com- 
ponents. So  long  as  the  photon  is  partly  in  one  beam  and  partly  in 
the  other,  interference  can  occur  when  the  two  beams  are  superposed, 
but this possibility disappears when the photon is forced entirely into
one of the beams by an observation. The other beam then no longer
enters into the description of the photon, so that it counts as being
entirely in the one beam in the ordinary way for any experiment that
may subsequently be performed on it.

† The circumstance that the superposition idea requires us to generalize our
original meaning of translational states, but that no corresponding generalization was
needed for the states of polarization of the preceding section, is an accidental one
with no underlying theoretical significance.

On  these  lines  quantum  mechanics  is  able  to  effect  a reconciliation 
of  the  wave  and  corpuscular  properties  of  light.  The  essential  point 
is  the  association  of  each  of  the  translational  states  of  a photon  with 
one  of  the  wave  functions  of  ordinary  wave  optics.  The  nature  of  this 
association  cannot  be  pictured  on  a basis  of  classical  mechanics,  but 
is  something  entirely  new.  It  would  be  quite  wrong  to  picture  the 
photon  and  its  associated  wave  as  interacting  in  the  way  in  which 
particles and waves can interact in classical mechanics. The
association can be interpreted only statistically, the wave function giving
us information about the probability of our finding the photon in any
particular place when we make an observation of where it is.

Some time before the discovery of quantum mechanics people
realized that the connexion between light waves and photons must
be of a statistical character. What they did not clearly realize,
however, was that the wave function gives information about the
probability of one photon being in a particular place and not the probable
number of photons in that place. The importance of the distinction
can be made clear in the following way. Suppose we have a beam
of light consisting of a large number of photons split up into two
components of equal intensity. On the assumption that the intensity of
a beam is connected with the probable number of photons in it, we
should have half the total number of photons going into each
component. If the two components are now made to interfere, we should
require a photon in one component to be able to interfere with one in
the other. Sometimes these two photons would have to annihilate one
another  and  other  times  they  would  have  to  produce  four  photons. 
This  would  contradict  the  conservation  of  energy.  The  new  theory, 
which  connects  the  wave  function  with  probabilities  for  one  photon, 
gets  over  the  difficulty  by  making  each  photon  go  partly  into  each  of 
the  two  components.  Each  photon  then  interferes  only  with  itself. 
Interference  between  two  different  photons  never  occurs. 
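
The accounting in this argument can be checked with a short numerical sketch. It assumes a rule that has not yet been stated at this point, namely that the probability of finding the photon at an output is the squared modulus of the sum of the component wave amplitudes arriving there. The function name and the equal-split factors of 1/2 are illustrative choices and are not taken from the text.

```python
import cmath
import math

def output_probabilities(phase_difference):
    """Probabilities of finding one photon at each of the two outputs
    where the components of an equally split beam are recombined.

    Assumes the squared-modulus rule for probabilities, which is an
    assumption at this stage of the argument."""
    shift = cmath.exp(1j * phase_difference)
    # Each component contributes amplitude 1/2 at either output; the two
    # outputs differ in the relative sign with which the components add.
    p_first = abs(0.5 + 0.5 * shift) ** 2
    p_second = abs(0.5 - 0.5 * shift) ** 2
    return p_first, p_second

for phi in (0.0, math.pi / 2, math.pi):
    p1, p2 = output_probabilities(phi)
    # The two probabilities are printed together with their sum.
    print(f"phase {phi:.3f}: {p1:.3f} + {p2:.3f} = {p1 + p2:.3f}")
```

For every phase difference the two probabilities add up to unity, so a single photon is neither lost nor multiplied. Destructive interference at one output merely makes the photon certain to appear at the other.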

The  association  of  particles  with  waves  discussed  above  is  not 
restricted  to  the  case  of  light,  but  is,  according  to  modern  theory, 
of  universal  applicability.  All  kinds  of  particles  are  associated  with 
waves in this way and conversely all wave motion is associated with
particles. Thus all particles can be made to exhibit interference
effects  and  all  wave  motion  has  its  energy  in  the  form  of  quanta.  The 
reason  why  these  general  phenomena  are  not  more  obvious  is  on 
account  of  a law  of  proportionality  between  the  mass  or  energy  of  the 
particles  and  the  frequency  of  the  waves,  the  coefficient  being  such 
that  for  waves  of  familiar  frequencies  the  associated  quanta  are 
extremely  small,  while  for  particles  even  as  light  as  electrons  the 
associated  wave  frequency  is  so  high  that  it  is  not  easy  to  demonstrate 
interference. 

4.  Superposition  and  indeterminacy 

The  reader  may  possibly  feel  dissatisfied  with  the  attempt  in  the 
two  preceding  sections  to  fit  in  the  existence  of  photons  with  the 
classical  theory  of  light.  He  may  argue  that  a very  strange  idea  has 
been  introduced — the  possibility  of  a photon  being  partly  in  each  of 
two  states  of  polarization,  or  partly  in  each  of  two  separate  beams — 
but  even  with  the  help  of  this  strange  idea  no  satisfying  picture  of 
the  fundamental  single-photon  processes  has  been  given.  He  may  say 
further  that  this  strange  idea  did  not  provide  any  information  about 
experimental  results  for  the  experiments  discussed,  beyond  what 
could  have  been  obtained  from  an  elementary  consideration  of 
photons  being  guided  in  some  vague  way  by  waves.  What,  then,  is 
the use of the strange idea?

In  answer  to  the  first  criticism  it  may  be  remarked  that  the  main 
object  of  physical  science  is  not  the  provision  of  pictures,  but  is  the 
formulation  of  laws  governing  phenomena  and  the  application  of 
these  laws  to  the  discovery  of  new  phenomena.  If  a picture  exists, 
so  much  the  better;  but  whether  a picture  exists  or  not  is  a matter 
of  only  secondary  importance.  In  the  case  of  atomic  phenomena 
no  picture  can  be  expected  to  exist  in  the  usual  sense  of  the  word 
‘picture’,  by  which  is  meant  a model  functioning  essentially  on 
classical  lines.  One  may,  however,  extend  the  meaning  of  the  word 
‘picture’  to  include  any  way  of  looking  at  the  fundamental  laws  which 
makes  their  self-consistency  obvious.  With  this  extension,  one  may 
gradually  acquire  a picture  of  atomic  phenomena  by  becoming 
familiar  with  the  laws  of  the  quantum  theory. 

With  regard  to  the  second  criticism,  it  may  be  remarked  that  for 
many  simple  experiments  with  light,  an  elementary  theory  of  waves 
and photons connected in a vague statistical way would be adequate
to account for the results. In the case of such experiments quantum
mechanics  has  no  further  information  to  give.  In  the  great  majority 
of  experiments,  however,  the  conditions  are  too  complex  for  an 
elementary  theory  of  this  kind  to  be  applicable  and  some  more 
elaborate  scheme,  such  as  is  provided  by  quantum  mechanics,  is  then 
needed.  The  method  of  description  that  quantum  mechanics  gives 
in  the  more  complex  cases  is  applicable  also  to  the  simple  cases  and 
although it is then not really necessary for accounting for the
experimental results, its study in these simple cases is perhaps a suitable
introduction  to  its  study  in  the  general  case. 

There  remains  an  overall  criticism  that  one  may  make  to  the  whole 
scheme,  namely,  that  in  departing  from  the  determinacy  of  the 
classical theory a great complication is introduced into the
description of Nature, which is a highly undesirable feature. This
complication is undeniable, but it is offset by a great simplification, provided
by the general principle of superposition of states, which we shall now
go on to consider. But first it is necessary to make precise the
important concept of a ‘state’ of a general atomic system.

Let  us  take  any  atomic  system,  composed  of  particles  or  bodies 
with  specified  properties  (mass,  moment  of  inertia,  etc.)  interacting 
according  to  specified  laws  of  force.  There  will  be  various  possible 
motions  of  the  particles  or  bodies  consistent  with  the  laws  of  force. 
Each  such  motion  is  called  a state  of  the  system.  According  to 
classical  ideas  one  could  specify  a state  by  giving  numerical  values 
to  all  the  coordinates  and  velocities  of  the  various  component  parts 
of  the  system  at  some  instant  of  time,  the  whole  motion  being  then 
completely  determined.  Now  the  argument  of  pp.  3 and  4 shows  that 
we  cannot  observe  a small  system  with  that  amount  of  detail  which 
classical theory supposes. The limitation in the power of observation
puts a limitation on the number of data that can be assigned to a
state. Thus a state of an atomic system must be specified by fewer
or  more  indefinite  data  than  a complete  set  of  numerical  values 
for  all  the  coordinates  and  velocities  at  some  instant  of  time.  In  the 
case when the system is just a single photon, a state would be
completely specified by a given translational state in the sense of § 3
together  with  a given  state  of  polarization  in  the  sense  of  § 2. 

A state of a system may be defined as an undisturbed motion that
is restricted by as many conditions or data as are theoretically
possible without mutual interference or contradiction. In practice
the conditions could be imposed by a suitable preparation of the
system,  consisting  perhaps  in  passing  it  through  various  kinds  of 
sorting  apparatus,  such  as  slits  and  polarimeters,  the  system  being 
left  undisturbed  after  the  preparation.  The  word  ‘state’  may  be 
used  to  mean  either  the  state  at  one  particular  time  (after  the 
preparation),  or  the  state  throughout  the  whole  of  time  after  the 
preparation.  To  distinguish  these  two  meanings,  the  latter  will  be 
called  a ‘state  of  motion’  when  there  is  liable  to  be  ambiguity. 

The  general  principle  of  superposition  of  quantum  mechanics 
applies  to  the  states,  with  either  of  the  above  meanings,  of  any  one 
dynamical system. It requires us to assume that between these
states there exist peculiar relationships such that whenever the
system is definitely in one state we can consider it as being partly
in each of two or more other states. The original state must be
regarded  as  the  result  of  a kind  of  superposition  of  the  two  or  more 
new  states,  in  a way  that  cannot  be  conceived  on  classical  ideas.  Any 
state  may  be  considered  as  the  result  of  a superposition  of  two  or 
more other states, and indeed in an infinite number of ways.
Conversely any two or more states may be superposed to give a new
state. The procedure of expressing a state as the result of
superposition of a number of other states is a mathematical procedure
that is always permissible, independent of any reference to physical
conditions, like the procedure of resolving a wave into Fourier
components. Whether it is useful in any particular case, though, depends
on  the  special  physical  conditions  of  the  problem  under  consideration. 

In the two preceding sections examples were given of the
superposition principle applied to a system consisting of a single photon.
§ 2 dealt with states differing only with regard to the polarization and
§ 3 with states differing only with regard to the motion of the photon
as a whole.


The  nature  of  the  relationships  which  the  superposition  principle 
requires  to  exist  between  the  states  of  any  system  is  of  a kind  that 
cannot  be  explained  in  terms  of  familiar  physical  concepts.  One 
cannot  in  the  classical  sense  picture  a system  being  partly  in  each  of 
two states and see the equivalence of this to the system being
completely in some other state. There is an entirely new idea involved,
to which one must get accustomed and in terms of which one must
proceed to build up an exact mathematical theory, without having
any detailed classical picture.

When a state is formed by the superposition of two other states,
it  will  have  properties  that  are  in  some  vague  way  intermediate 
between  those  of  the  two  original  states  and  that  approach  more  or 
less  closely  to  those  of  either  of  them  according  to  the  greater  or  less 
‘weight’  attached  to  this  state  in  the  superposition  process.  The  new 
state  is  completely  defined  by  the  two  original  states  when  their 
relative  weights  in  the  superposition  process  are  known,  together 
with  a certain  phase  difference,  the  exact  meaning  of  weights  and 
phases being provided in the general case by the mathematical theory.
In the case of the polarization of a photon their meaning is that
provided by classical optics, so that, for example, when two
perpendicularly plane polarized states are superposed with equal weights, the
new state may be circularly polarized in either direction, or linearly
polarized at an angle ¼π, or else elliptically polarized, according to
the phase difference.
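
The dependence on the phase difference in this polarization example can be sketched in a few lines. The classification follows classical optics for two perpendicular components of equal amplitude; the function name and the numerical tolerance are illustrative assumptions.

```python
import cmath
import math

def classify_polarization(phase_difference, tol=1e-9):
    """Classify the superposition of two perpendicular plane-polarized
    states taken with equal weights, given their phase difference."""
    # Reduce the phase difference to the interval [-pi, pi].
    delta = cmath.phase(cmath.exp(1j * phase_difference))
    if abs(delta) < tol or abs(abs(delta) - math.pi) < tol:
        # In phase or exactly out of phase: plane polarized at 45 degrees
        # to the two original directions.
        return "linear"
    if abs(abs(delta) - math.pi / 2) < tol:
        # A quarter period apart: the sign of delta fixes the sense of rotation.
        return "circular"
    return "elliptical"

for delta in (0.0, math.pi / 2, -math.pi / 2, 0.3):
    print(delta, classify_polarization(delta))
```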


The  non-classical  nature  of  the  superposition  process  is  brought 
out  clearly  if  we  consider  the  superposition  of  two  states,  A and  B, 
such  that  there  exists  an  observation  which,  when  made  on  the 
system  in  state  A,  is  certain  to  lead  to  one  particular  result,  a say,  and 
when  made  on  the  system  in  state  B is  certain  to  lead  to  some  different 
result,  b say.  What  will  be  the  result  of  the  observation  when  made 
on  the  system  in  the  superposed  state?  The  answer  is  that  the  result 
will be sometimes a and sometimes b, according to a probability law
depending on the relative weights of A and B in the superposition
process. It will never be different from both a and b. The
intermediate character of the state formed by superposition thus expresses
itself through the probability of a particular result for an observation
being intermediate between the corresponding probabilities for the original
states,† not through the result itself being intermediate between the
corresponding results for the original states.
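
As an illustration of such a probability law, the sketch below assumes the squared-modulus rule of the later mathematical theory, which the text has not yet introduced: the probability of each result is taken to be proportional to the squared modulus of the corresponding coefficient. The function name and its arguments are hypothetical.

```python
def result_probabilities(c_a, c_b):
    """Return (P(a), P(b)) for the state formed by superposing A and B
    with complex coefficients c_a and c_b.

    Assumes probabilities proportional to the squared moduli of the
    coefficients, with A and B giving results a and b with certainty."""
    w_a = abs(c_a) ** 2
    w_b = abs(c_b) ** 2
    total = w_a + w_b
    if total == 0:
        raise ValueError("both coefficients vanish, so there is no state")
    return w_a / total, w_b / total

# Equal weights, and then unequal weights with a phase difference.
print(result_probabilities(1, 1))
print(result_probabilities(3, 4j))
```

Only a and b ever appear as results. The coefficients influence how often each occurs, and in this sketch the phase difference between them does not enter.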

† The probability of a particular result for the state formed by
superposition is not always intermediate between those for the original
states in the general case when those for the original states are not zero
or unity, so there are restrictions on the ‘intermediateness’ of a state
formed by superposition.

In this way we see that such a drastic departure from ordinary
ideas as the assumption of superposition relationships between the
states is possible only on account of the recognition of the importance
of the disturbance accompanying an observation and of the
consequent indeterminacy in the result of the observation. When an
observation is made on any atomic system that is in a given state,
in general the result will not be determinate, i.e., if the experiment
is repeated several times under identical conditions several different
results may be obtained. It is a law of nature, though, that if the
experiment is repeated a large number of times, each particular result
will be obtained in a definite fraction of the total number of times, so
that there is a definite probability of its being obtained. This
probability is what the theory sets out to calculate. Only in special cases
when the probability for some result is unity is the result of the
experiment determinate.
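
The statement about definite fractions can be imitated with a pseudo-random simulation. This is only a sketch: a seeded random number generator stands in for the indeterminate outcome of each observation, and the probability 0.36 is an arbitrary illustrative value.

```python
import random

def observed_fraction(probability, trials, seed=0):
    """Fraction of simulated repetitions in which a result of the given
    probability is obtained."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits = sum(1 for _ in range(trials) if rng.random() < probability)
    return hits / trials

# Individual runs are not determinate, but the fraction settles down
# as the number of repetitions grows.
for n in (10, 1000, 100000):
    print(n, observed_fraction(0.36, n))
```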

The  assumption  of  superposition  relationships  between  the  states 
leads  to  a mathematical  theory  in  which  the  equations  that  define 
a state  are  linear  in  the  unknowns.  In  consequence  of  this,  people 
have  tried  to  establish  analogies  with  systems  in  classical  mechanics, 
such  as  vibrating  strings  or  membranes,  which  are  governed  by  linear 
equations  and  for  which,  therefore,  a superposition  principle  holds. 
Such analogies have led to the name ‘Wave Mechanics’ being
sometimes given to quantum mechanics. It is important to remember,
however, that the superposition that occurs in quantum mechanics is
of an essentially different nature from any occurring in the classical
theory, as is shown by the fact that the quantum superposition
principle demands indeterminacy in the results of observations in order
to  be  capable  of  a sensible  physical  interpretation.  The  analogies  are 
thus  liable  to  be  misleading. 

5.  Mathematical  formulation  of  the  principle 

A profound  change  has  taken  place  during  the  present  century  in 
the  opinions  physicists  have  held  on  the  mathematical  foundations 
of  their  subject.  Previously  they  supposed  that  the  principles  of 
Newtonian  mechanics  would  provide  the  basis  for  the  description 
of  the  whole  of  physical  phenomena  and  that  all  the  theoretical 
physicist had to do was suitably to develop and apply these
principles. With the recognition that there is no logical reason why
Newtonian and other classical principles should be valid outside the
domains in which they have been experimentally verified has come
the realization that departures from these principles are indeed
necessary. Such departures find their expression through the
introduction of new mathematical formalisms, new schemes of axioms
and  rules  of  manipulation,  into  the  methods  of  theoretical  physics. 

Quantum mechanics provides a good example of the new ideas. It
requires the states of a dynamical system and the dynamical variables
to  be  interconnected  in  quite  strange  ways  that  are  unintelligible 
from  the  classical  standpoint.  The  states  and  dynamical  variables 
have  to  be  represented  by  mathematical  quantities  of  different 
natures  from  those  ordinarily  used  in  physics.  The  new  scheme 
becomes  a precise  physical  theory  when  all  the  axioms  and  rules  of 
manipulation  governing  the  mathematical  quantities  are  specified 
and  when  in  addition  certain  laws  are  laid  down  connecting  physical 
facts  with  the  mathematical  formalism,  so  that  from  any  given 
physical  conditions  equations  between  the  mathematical  quantities 
may  be  inferred  and  vice  versa.  In  an  application  of  the  theory  one 
would be given certain physical information, which one would
proceed to express by equations between the mathematical quantities.
One  would  then  deduce  new  equations  with  the  help  of  the  axioms 
and  rules  of  manipulation  and  would  conclude  by  interpreting  these 
new  equations  as  physical  conditions.  The  justification  for  the  whole 
scheme  depends,  apart  from  internal  consistency,  on  the  agreement 
of  the  final  results  with  experiment. 

We shall begin to set up the scheme by dealing with the
mathematical relations between the states of a dynamical system at one
instant of time, which relations will come from the mathematical
formulation of the principle of superposition. The superposition
process is a kind of additive process and implies that states can in some
way be added to give new states. The states must therefore be
connected with mathematical quantities of a kind which can be added
together to give other quantities of the same kind. The most obvious
of such quantities are vectors. Ordinary vectors, existing in a space
of a finite number of dimensions, are not sufficiently general for
most of the dynamical systems in quantum mechanics. We have to
make  a generalization  to  vectors  in  a space  of  an  infinite  number  of 
dimensions,  and  the  mathematical  treatment  becomes  complicated 
by  questions  of  convergence.  For  the  present,  however,  we  shall  deal 
merely  with  some  general  properties  of  the  vectors,  properties  which 
can  be  deduced  on  the  basis  of  a simple  scheme  of  axioms,  and 
questions  of  convergence  and  related  topics  will  not  be  gone  into 
until  the  need  arises. 

It  is  desirable  to  have  a special  name  for  describing  the  vectors 
which are connected with the states of a system in quantum
mechanics, whether they are in a space of a finite or an infinite number of
dimensions. We shall call them ket vectors, or simply kets, and denote
a general  one  of  them  by  a special  symbol  |>.  If  we  want  to  specify 
a particular  one  of  them  by  a label,  A say,  we  insert  it  in  the  middle, 
thus  |A>.  The  suitability  of  this  notation  will  become  clear  as  the 
scheme  is  developed. 

Ket  vectors  may  be  multiplied  by  complex  numbers  and  may  be 
added  together  to  give  other  ket  vectors,  e.g.  from  two  ket  vectors 
|A> and |B> we can form

$$c_1|A\rangle + c_2|B\rangle = |R\rangle, \tag{1}$$

say, where c1 and c2 are any two complex numbers. We may also
perform more general linear processes with them, such as adding an
infinite sequence of them, and if we have a ket vector |x>, depending
on and labelled by a parameter x which can take on all values in a
certain range, we may integrate it with respect to x, to get another
ket vector

$$\int |x\rangle \, dx = |Q\rangle$$

say. A ket vector which is expressible linearly in terms of certain
others is said to be dependent on them. A set of ket vectors are called
independent if no one of them is expressible linearly in terms of the
others.
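
These linear operations can be made concrete in the simplest possible setting. The sketch below assumes, purely for illustration, that a ket is represented by a list of two complex components; the text makes no such restriction, and the function names are hypothetical.

```python
def superpose(c1, ket_a, c2, ket_b):
    """Form c1|A> + c2|B> as in equation (1), component by component."""
    return [c1 * a + c2 * b for a, b in zip(ket_a, ket_b)]

def are_independent_2d(ket_a, ket_b, tol=1e-12):
    """Two kets with two components each are independent exactly when
    the determinant formed from their components does not vanish."""
    det = ket_a[0] * ket_b[1] - ket_a[1] * ket_b[0]
    return abs(det) > tol

ket_a = [1, 0]
ket_b = [0, 1]
ket_r = superpose(1, ket_a, 1j, ket_b)   # |R> = |A> + i|B>

print(ket_r)
print(are_independent_2d(ket_a, ket_b))        # neither is a multiple of the other
print(are_independent_2d(ket_r, [2, 2j]))      # second ket is 2|R>, hence dependent
```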

We  now  assume  that  each  state  of  a dynamical  system  at  a particular 
time  corresponds  to  a ket  vector,  the  correspondence  being  such  that  if  a 
state results from the superposition of certain other states, its
corresponding ket vector is expressible linearly in terms of the corresponding ket
vectors  of  the  other  states,  and  conversely.  Thus  the  state  R results  from 
a superposition  of  the  states  A and  B when  the  corresponding  ket 
vectors  are  connected  by  (1). 

The above assumption leads to certain properties of the
superposition process, properties which are in fact necessary for the word
‘superposition’  to  be  appropriate.  When  two  or  more  states  are 
superposed,  the  order  in  which  they  occur  in  the  superposition 
process  is  unimportant,  so  the  superposition  process  is  symmetrical 
between  the  states  that  are  superposed.  Again,  we  see  from  equation 
(1) that (excluding the case when the coefficient c1 or c2 is zero) if
the  state  R can  be  formed  by  superposition  of  the  states  A and  B, 
then  the  state  A can  be  formed  by  superposition  of  B and  R,  and  B 
can  be  formed  by  superposition  of  A and  R.  The  superposition 
relationship  is  symmetrical  between  all  three  states  A,  B,  and  R. 
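
Solving equation (1) for each of the other two kets makes this symmetry explicit. The rearrangement below assumes, as in the text, that neither c1 nor c2 is zero:

```latex
|A\rangle = \frac{1}{c_1}\,|R\rangle - \frac{c_2}{c_1}\,|B\rangle ,
\qquad
|B\rangle = \frac{1}{c_2}\,|R\rangle - \frac{c_1}{c_2}\,|A\rangle .
```

Each of the three kets is thus expressed linearly in terms of the other two, with coefficients that are again complex numbers.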

A state which results from the superposition of certain other
states  will  be  said  to  be  dependent  on  those  states.  More  generally, 
a state  will  be  said  to  be  dependent  on  any  set  of  states,  finite  or 
infinite  in  number,  if  its  corresponding  ket  vector  is  dependent  on 
the  corresponding  ket  vectors  of  the  set  of  states.  A set  of  states 
will  be  called  independent  if  no  one  of  them  is  dependent  on  the 
others. 

To  proceed  with  the  mathematical  formulation  of  the  superposition 
principle we must introduce a further assumption, namely the
assumption that by superposing a state with itself we cannot form any new
state, but only the original state over again. If the original state
corresponds to the ket vector |A>, when it is superposed with itself
the resulting state will correspond to

$$c_1|A\rangle + c_2|A\rangle = (c_1 + c_2)|A\rangle,$$

where c1 and c2 are numbers. Now we may have c1+c2 = 0, in which
case the result of the superposition process would be nothing at all,
the two components having cancelled each other by an interference
effect. Our new assumption requires that, apart from this special
case, the resulting state must be the same as the original one, so that
(c1+c2)|A> must correspond to the same state that |A> does. Now
c1+c2 is an arbitrary complex number and hence we can conclude
that if the ket vector corresponding to a state is multiplied by any
complex number, not zero, the resulting ket vector will correspond to the
same state. Thus a state is specified by the direction of a ket vector
and any length one may assign to the ket vector is irrelevant. All
the states of the dynamical system are in one-one correspondence
with all the possible directions for a ket vector, no distinction being
made between the directions of the ket vectors |A> and −|A>.
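
The rule that a ket and any non-zero complex multiple of it describe the same state can be turned into a simple test. The sketch assumes, for illustration only, that kets are represented by equal-length lists of complex components; the function name is hypothetical.

```python
def same_state(ket_u, ket_v, tol=1e-12):
    """True if two non-zero kets differ only by a non-zero complex
    factor, i.e. have the same direction."""
    if all(abs(x) <= tol for x in ket_u) or all(abs(x) <= tol for x in ket_v):
        raise ValueError("a vanishing ket does not correspond to any state")
    n = len(ket_u)
    # Two vectors are proportional exactly when every 2x2 determinant
    # built from their components vanishes.
    return all(
        abs(ket_u[i] * ket_v[j] - ket_u[j] * ket_v[i]) <= tol
        for i in range(n)
        for j in range(i + 1, n)
    )

ket_a = [1, 1j]
print(same_state(ket_a, [2j * x for x in ket_a]))   # multiplied by 2i
print(same_state(ket_a, [-x for x in ket_a]))       # |A> and -|A>
print(same_state([1, 0], [0, 1]))                   # different directions
```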

The  assumption  just  made  shows  up  very  clearly  the  fundamental 
difference  between  the  superposition  of  the  quantum  theory  and  any 
kind  of  classical  superposition.  In  the  case  of  a classical  system  for 
which a superposition principle holds, for instance a vibrating
membrane, when one superposes a state with itself the result is a different
state,  with  a different  magnitude  of  the  oscillations.  There  is  no 
physical  characteristic  of  a quantum  state  corresponding  to  the 
magnitude  of  the  classical  oscillations,  as  distinct  from  their  quality, 
described  by  the  ratios  of  the  amplitudes  at  different  points  of 
the membrane. Again, while there exists a classical state with zero
amplitude of oscillation everywhere, namely the state of rest, there
does  not  exist  any  corresponding  state  for  a quantum  system,  the 
zero  ket  vector  corresponding  to  no  state  at  all. 

Given two states corresponding to the ket vectors |A> and |B>,
the general state formed by superposing them corresponds to a ket
vector |R> which is determined by two complex numbers, namely
the coefficients c1 and c2 of equation (1). If these two coefficients are
multiplied by the same factor (itself a complex number), the ket
vector |R> will get multiplied by this factor and the corresponding
state will be unaltered. Thus only the ratio of the two coefficients
is effective in determining the state R. Hence this state is
determined by one complex number, or by two real parameters. Thus
from  two  given  states,  a twofold  infinity  of  states  may  be  obtained 
by  superposition. 
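
The counting of parameters can be written out explicitly. In the lines below the case c1 ≠ 0 is assumed, and the symbols ρ and θ are introduced here for illustration only:

```latex
|R\rangle = c_1|A\rangle + c_2|B\rangle
          = c_1\left\{ |A\rangle + \frac{c_2}{c_1}\,|B\rangle \right\},
\qquad
\frac{c_2}{c_1} = \rho\, e^{i\theta}, \quad \rho \ge 0, \quad 0 \le \theta < 2\pi .
```

The overall factor c1 does not affect the state, so R is fixed by the two real numbers ρ and θ. The remaining case c1 = 0 gives the state B itself.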

This  result  is  confirmed  by  the  examples  discussed  in  §§  2 and  3. 
In the example of § 2 there are just two independent states of
polarization for a photon, which may be taken to be the states of plane
polarization parallel and perpendicular to some fixed direction, and
from the superposition of these two a twofold infinity of states of
polarization can be obtained, namely all the states of elliptic
polarization, the general one of which requires two parameters to describe
it.  Again,  in  the  example  of  § 3,  from  the  superposition  of  two  given 
translational  states  for  a photon  a twofold  infinity  of  translational 
states  may  be  obtained,  the  general  one  of  which  is  described  by  two 
parameters,  which  may  be  taken  to  be  the  ratio  of  the  amplitudes 
of  the  two  wave  functions  that  are  added  together  and  their  phase 
relationship.  This  confirmation  shows  the  need  for  allowing  complex 
coefficients  in  equation  (1).  If  these  coefficients  were  restricted  to  be 
real,  then,  since  only  their  ratio  is  of  importance  for  determining  the 
direction of the resultant ket vector |R> when |A> and |B> are
given,  there  would  be  only  a simple  infinity  of  states  obtainable  from 
the  superposition. 

6.  Bra  and  ket  vectors 

Whenever  we  have  a set  of  vectors  in  any  mathematical  theory, 
we  can  always  set  up  a second  set  of  vectors,  which  mathematicians 
call  the  dual  vectors.  The  procedure  will  be  described  for  the  case 
when  the  original  vectors  are  our  ket  vectors. 

Suppose we have a number φ which is a function of a ket vector
|A>, i.e. to each ket vector |A> there corresponds one number φ,
and suppose further that the function is a linear one, which means
that the number corresponding to |A> + |A'> is the sum of the
numbers corresponding to |A> and to |A'>, and the number
corresponding to c|A> is c times the number corresponding to |A>, c
being any numerical factor. Then the number φ corresponding to
any |A> may be looked upon as the scalar product of that |A> with
some new vector, there being one of these new vectors for each linear
function of the ket vectors |A>. The justification for this way of
looking at φ is that, as will be seen later (see equations (5) and (6)),
the new vectors may be added together and may be multiplied by
numbers to give other vectors of the same kind. The new vectors
are, of course, defined only to the extent that their scalar products
with the original ket vectors are given numbers, but this is
sufficient for one to be able to build up a mathematical theory about
them.

We  shall  call  the  new  vectors  bra  vectors,  or  simply  bras,  and  denote 
a general one of them by the symbol < |, the mirror image of the
symbol for a ket vector. If we want to specify a particular one of
them by a label, B say, we write it in the middle, thus <B|. The
scalar product of a bra vector <B| and a ket vector |A> will be
written <B|A>, i.e. as a juxtaposition of the symbols for the bra
and  ket  vectors,  that  for  the  bra  vector  being  on  the  left,  and  the 
two  vertical  lines  being  contracted  to  one  for  brevity. 

One  may  look  upon  the  symbols  < and  > as  a distinctive  kind  of 
brackets. A scalar product <B|A> now appears as a complete bracket
expression and a bra vector <B| or a ket vector |A> as an incomplete
bracket  expression.  We  have  the  rules  that  any  complete  bracket 
expression  denotes  a number  and  any  incomplete  bracket  expression 
denotes  a vector,  of  the  bra  or  ket  kind  according  to  whether  it  contains 
the  first  or  second  part  of  the  brackets. 

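The condition that the scalar product of <B| and |A> is a linear
function of |A> may be expressed symbolically by

$$\langle B|\{|A\rangle + |A'\rangle\} = \langle B|A\rangle + \langle B|A'\rangle, \tag{2}$$

$$\langle B|\{c|A\rangle\} = c\langle B|A\rangle, \tag{3}$$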

c being  any  number. 

A bra vector is considered to be completely defined when its scalar
product with every ket vector is given, so that if a bra vector has its
scalar product with every ket vector vanishing, the bra vector itself
must be considered as vanishing. In symbols, if

$$\langle P|A\rangle = 0 \quad \text{for all } |A\rangle, \qquad \text{then} \quad \langle P| = 0. \tag{4}$$

The sum of two bra vectors $\langle B|$ and $\langle B'|$ is defined by the condition
that its scalar product with any ket vector $|A\rangle$ is the sum of the
scalar products of $\langle B|$ and $\langle B'|$ with $|A\rangle$,

$$\{\langle B| + \langle B'|\}|A\rangle = \langle B|A\rangle + \langle B'|A\rangle, \tag{5}$$

and the product of a bra vector $\langle B|$ and a number $c$ is defined by the
condition that its scalar product with any ket vector $|A\rangle$ is $c$ times
the scalar product of $\langle B|$ with $|A\rangle$,

$$\{c\langle B|\}|A\rangle = c\langle B|A\rangle. \tag{6}$$

Equations (2) and (5) show that products of bra and ket vectors
satisfy the distributive axiom of multiplication, and equations (3)
and (6) show that multiplication by numerical factors satisfies the
usual algebraic axioms.

The bra vectors, as they have been here introduced, are quite a
different kind of vector from the kets, and so far there is no connexion
between them except for the existence of a scalar product of a bra
and a ket. We now make the assumption that there is a one-one
correspondence between the bras and the kets, such that the bra
corresponding to $|A\rangle + |A'\rangle$ is the sum of the bras corresponding to
$|A\rangle$ and to $|A'\rangle$, and the bra corresponding to $c|A\rangle$ is $\bar{c}$ times the
bra corresponding to $|A\rangle$, $\bar{c}$ being the conjugate complex number to $c$.
We shall use the same label to specify a ket and the corresponding bra.
Thus the bra corresponding to $|A\rangle$ will be written $\langle A|$.
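As a numerical aside that is not part of the original argument, the correspondence can be tried out on a small model. The sketch below assumes, purely for illustration, that a ket is a pair of complex components in a fixed finite basis and that the corresponding bra has the complex-conjugate components; the helper names `bra_of`, `add` and `scale` are hypothetical.

```python
# Assumption: a ket is a tuple of complex components in a fixed basis, and the
# corresponding bra is obtained by conjugating every component.

def bra_of(ket):
    """Return the components of the bra corresponding to a ket."""
    return tuple(x.conjugate() for x in ket)

def add(u, v):
    """Component-wise sum of two vectors of the same kind."""
    return tuple(a + b for a, b in zip(u, v))

def scale(c, v):
    """Multiply every component of a vector by the number c."""
    return tuple(c * a for a in v)

A = (1 + 2j, 3 - 1j)
A2 = (0.5j, -2 + 0j)
c = 2 - 3j

# The bra corresponding to |A> + |A'> is the sum of the two bras.
sum_rule = bra_of(add(A, A2)) == add(bra_of(A), bra_of(A2))
# The bra corresponding to c|A> is conj(c) times the bra of |A> ...
antilinear = bra_of(scale(c, A)) == scale(c.conjugate(), bra_of(A))
# ... and not c times it.
linear = bra_of(scale(c, A)) == scale(c, bra_of(A))

print(sum_rule, antilinear, linear)
```

In this model the first two checks succeed and the third fails, which is the antilinear behaviour stated in the text.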

The  relationship  between  a ket  vector  and  the  corresponding  bra 
makes  it  reasonable  to  call  one  of  them  the  conjugate  imaginary  of 
the  other.  Our  bra  and  ket  vectors  are  complex  quantities,  since  they 
can  be  multiplied  by  complex  numbers  and  are  then  of  the  same 
nature  as  before,  but  they  are  complex  quantities  of  a special  kind 
which  cannot  be  split  up  into  real  and  pure  imaginary  parts.  The 
usual  method  of  getting  the  real  part  of  a complex  quantity,  by 
taking  half  the  sum  of  the  quantity  itself  and  its  conjugate,  cannot 
be  applied  since  a bra  and  a ket  vector  are  of  different  natures  and 
cannot  be  added  together.  To  call  attention  to  this  distinction,  we 
shall use the words ‘conjugate complex’ to refer to numbers and
other  complex  quantities  which  can  be  split  up  into  real  and  pure 
imaginary  parts,  and  the  words  ‘conjugate  imaginary’  for  bra  and 
ket  vectors,  which  cannot.  With  the  former  kind  of  quantity,  we 
shall  use  the  notation  of  putting  a bar  over  one  of  them  to  get  the 
conjugate  complex  one. 

On  account  of  the  one-one  correspondence  between  bra  vectors  and 
ket  vectors,  any  state  of  our  dynamical  system  at  a particular  time  may 
be  specified  by  the  direction  of  a bra  vector  just  as  well  as  by  the  direction 
of  a ket  vector.  In  fact  the  whole  theory  will  be  symmetrical  in  its 
essentials  between  bras  and  kets. 

Given any two ket vectors $|A\rangle$ and $|B\rangle$, we can construct from
them a number $\langle B|A\rangle$ by taking the scalar product of the first with
the conjugate imaginary of the second. This number depends linearly
on $|A\rangle$ and antilinearly on $|B\rangle$, the antilinear dependence meaning
that the number formed from $|B\rangle + |B'\rangle$ is the sum of the numbers
formed from $|B\rangle$ and from $|B'\rangle$, and the number formed from $c|B\rangle$
is $\bar{c}$ times the number formed from $|B\rangle$. There is a second way in
which we can construct a number which depends linearly on $|A\rangle$ and
antilinearly on $|B\rangle$, namely by forming the scalar product of $|B\rangle$
with the conjugate imaginary of $|A\rangle$ and taking the conjugate
complex of this scalar product. We assume that these two numbers are
always equal, i.e.

$$\langle B|A\rangle = \overline{\langle A|B\rangle}. \tag{7}$$

Putting $|B\rangle = |A\rangle$ here, we find that the number $\langle A|A\rangle$ must be
real. We make the further assumption

$$\langle A|A\rangle > 0, \tag{8}$$

except when $|A\rangle = 0$.
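Assumptions (7) and (8) can be checked on a concrete model. The following is only a sketch: it assumes a two-component complex representation of kets, and the function name `braket` is a hypothetical helper that conjugates the components of the left-hand vector before summing.

```python
# Assumption: kets are tuples of complex components; <B|A> is the sum over
# components of conj(B_i) * A_i.

def braket(B, A):
    """Scalar product <B|A> of the bra corresponding to B with the ket A."""
    return sum(b.conjugate() * a for b, a in zip(B, A))

A = (1 + 2j, 3 - 1j)
B = (2 - 1j, 1j)

lhs = braket(B, A)               # <B|A>
rhs = braket(A, B).conjugate()   # conjugate complex of <A|B>
norm_sq = braket(A, A)           # <A|A>

print(lhs, rhs, norm_sq)
```

Here `lhs` and `rhs` agree, as (7) requires, and `norm_sq` has zero imaginary part and positive real part, as (8) requires.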

In ordinary space, from any two vectors one can construct a
number, their scalar product, which is a real number and is
symmetrical between them. In the space of bra vectors or the space of
ket vectors, from any two vectors one can again construct a number,
the scalar product of one with the conjugate imaginary of the
other, but this number is complex and goes over into the conjugate
complex number when the two vectors are interchanged. There is
thus a kind of perpendicularity in these spaces, which is a
generalization of the perpendicularity in ordinary space. We shall call
a bra and a ket vector orthogonal if their scalar product is zero, and two
bras or two kets will be called orthogonal if the scalar product of one
with the conjugate imaginary of the other is zero. Further, we shall
say that two states of our dynamical system are orthogonal if the
vectors corresponding to these states are orthogonal.

The length of a bra vector $\langle A|$ or of the conjugate imaginary ket
vector $|A\rangle$ is defined as the square root of the positive number
$\langle A|A\rangle$. When we are given a state and wish to set up a bra or ket
vector to correspond to it, only the direction of the vector is given
and the vector itself is undetermined to the extent of an arbitrary
numerical factor. It is often convenient to choose this numerical
factor so that the vector is of length unity. This procedure is called
normalization and the vector so chosen is said to be normalized. The
vector is not completely determined even then, since one can still
multiply it by any number of modulus unity, i.e. any number $e^{i\gamma}$
where $\gamma$ is real, without changing its length. We shall call such a
number a phase factor.
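Normalization and the remaining freedom of a phase factor can be illustrated numerically. This is a minimal sketch under the assumption that kets are tuples of complex components; `braket`, `length` and `normalize` are hypothetical helper names, and the angle 0.7 is an arbitrary choice of the real number $\gamma$.

```python
import cmath
import math

def braket(B, A):
    """Scalar product <B|A> in the assumed component model."""
    return sum(b.conjugate() * a for b, a in zip(B, A))

def length(A):
    """Length of |A>: the square root of the positive number <A|A>."""
    return math.sqrt(braket(A, A).real)

def normalize(A):
    """Rescale |A> so that its length is unity."""
    n = length(A)
    return tuple(a / n for a in A)

A = (3 + 0j, 4j)
N = normalize(A)                      # same direction, length unity
phase = cmath.exp(1j * 0.7)           # a phase factor e^{i*gamma}
N_rot = tuple(phase * a for a in N)   # a different vector of the same length

print(length(A), length(N), length(N_rot))
```

The vectors `N` and `N_rot` differ, yet both have unit length up to rounding, so either could serve as the normalized vector for the same state.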

The foregoing assumptions give the complete scheme of relations
between the states of a dynamical system at a particular time. The
relations appear in mathematical form, but they imply physical
conditions, which will lead to results expressible in terms of
observations when the theory is developed further. For instance, if two
states are orthogonal, it means at present simply a certain equation in
our formalism, but this equation implies a definite physical relationship
between the states, which further developments of the theory will
enable us to interpret in terms of observational results (see the
bottom of p. 35).

7. Linear operators

In the preceding section we considered a number which is a linear
function of a ket vector, and this led to the concept of a bra vector.
We shall now consider a ket vector which is a linear function of a
ket vector, and this will lead to the concept of a linear operator.

Suppose we have a ket $|F\rangle$ which is a function of a ket $|A\rangle$, i.e.
to each ket $|A\rangle$ there corresponds one ket $|F\rangle$, and suppose further
that the function is a linear one, which means that the $|F\rangle$
corresponding to $|A\rangle + |A'\rangle$ is the sum of the $|F\rangle$'s corresponding to
$|A\rangle$ and to $|A'\rangle$, and the $|F\rangle$ corresponding to $c|A\rangle$ is $c$ times the
$|F\rangle$ corresponding to $|A\rangle$, $c$ being any numerical factor. Under these
conditions, we may look upon the passage from $|A\rangle$ to $|F\rangle$ as the
application of a linear operator to $|A\rangle$. Introducing the symbol $\alpha$
for the linear operator, we may write

$$|F\rangle = \alpha|A\rangle,$$

in which the result of $\alpha$ operating on $|A\rangle$ is written like a product
of $\alpha$ with $|A\rangle$. We make the rule that in such products the ket vector
must always be put on the right of the linear operator. The above
conditions of linearity may now be expressed by the equations

$$\alpha\{|A\rangle + |A'\rangle\} = \alpha|A\rangle + \alpha|A'\rangle, \qquad \alpha\{c|A\rangle\} = c\alpha|A\rangle. \tag{1}$$

A linear  operator  is  considered  to  be  completely  defined  when  the 
result  of  its  application  to  every  ket  vector  is  given.  Thus  a linear 
operator  is  to  be  considered  zero  if  the  result  of  its  application  to  every 
ket  vanishes,  and  two  linear  operators  are  to  be  considered  equal  if 
they  produce  the  same  result  when  applied  to  every  ket. 

Linear operators can be added together, the sum of two linear
operators being defined to be that linear operator which, operating
on any ket, produces the sum of what the two linear operators
separately would produce. Thus $\alpha + \beta$ is defined by

$$\{\alpha + \beta\}|A\rangle = \alpha|A\rangle + \beta|A\rangle \tag{2}$$

for any $|A\rangle$. Equation (2) and the first of equations (1) show that
products of linear operators with ket vectors satisfy the distributive
axiom of multiplication.

Linear operators can also be multiplied together, the product of
two linear operators being defined as that linear operator, the
application of which to any ket produces the same result as the
application of the two linear operators successively. Thus the product
$\alpha\beta$ is defined as the linear operator which, operating on any ket
$|A\rangle$, changes it into that ket which one would get by operating first on
$|A\rangle$ with $\beta$, and then on the result of the first operation with $\alpha$. In
symbols

$$\{\alpha\beta\}|A\rangle = \alpha\{\beta|A\rangle\}.$$

This definition appears as the associative axiom of multiplication for
the triple product of $\alpha$, $\beta$, and $|A\rangle$, and allows us to write this triple
product as $\alpha\beta|A\rangle$ without brackets. However, this triple product is
in general not the same as what we should get if we operated on $|A\rangle$
first with $\alpha$ and then with $\beta$, i.e. in general $\alpha\beta|A\rangle$ differs from
$\beta\alpha|A\rangle$, so that in general $\alpha\beta$ must differ from $\beta\alpha$. The commutative
axiom of multiplication does not hold for linear operators. It may happen
as a special case that two linear operators $\xi$ and $\eta$ are such that $\xi\eta$
and $\eta\xi$ are equal. In this case we say that $\xi$ commutes with $\eta$, or that
$\xi$ and $\eta$ commute.

By  repeated  applications  of  the  above  processes  of  adding  and 
multiplying  linear  operators,  one  can  form  sums  and  products  of 
more  than  two  of  them,  and  one  can  proceed  to  build  up  an  algebra 
with  them.  In  this  algebra  the  commutative  axiom  of  multiplication 
does  not  hold,  and  also  the  product  of  two  linear  operators  may 
vanish  without  either  factor  vanishing.  But  all  the  other  axioms  of 
ordinary  algebra,  including  the  associative  and  distributive  axioms 
of  multiplication,  are  valid,  as  may  easily  be  verified. 
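The two departures from ordinary algebra mentioned here can be seen in a small example. The sketch below assumes, for illustration only, that linear operators are represented by 2 x 2 matrices acting on two-component kets; the helper `matmul` and the particular matrices `alpha` and `beta` are hypothetical choices.

```python
# Assumption: linear operators are modelled as square matrices (nested tuples),
# and the product of operators is the usual matrix product.

def matmul(X, Y):
    """Product XY: apply Y first, then X."""
    n = len(X)
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n))
        for i in range(n)
    )

alpha = ((0, 1), (0, 0))
beta = ((0, 0), (1, 0))
zero = ((0, 0), (0, 0))

ab = matmul(alpha, beta)    # ((1, 0), (0, 0))
ba = matmul(beta, alpha)    # ((0, 0), (0, 1))
aa = matmul(alpha, alpha)   # the zero operator, although alpha is not zero

print(ab != ba, aa == zero, alpha != zero)
```

In this model `alpha` and `beta` do not commute, and the product of `alpha` with itself vanishes without `alpha` vanishing.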

If we take a number $k$ and multiply it into ket vectors, it appears
as a linear operator operating on ket vectors, the conditions (1) being
fulfilled with $k$ substituted for $\alpha$. A number is thus a special case of
a linear operator. It has the property that it commutes with all linear
operators and this property distinguishes it from a general linear
operator.

So far we have considered linear operators operating only on ket
vectors. We can give a meaning to their operating also on bra vectors,
in the following way. Take the scalar product of any bra $\langle B|$ with
the ket $\alpha|A\rangle$. This scalar product is a number which depends
linearly on $|A\rangle$ and therefore, from the definition of bras, it may be
considered as the scalar product of $|A\rangle$ with some bra. The bra thus
defined depends linearly on $\langle B|$, so we may look upon it as the result of
some linear operator applied to $\langle B|$. This linear operator is uniquely
determined by the original linear operator $\alpha$ and may reasonably be
called the same linear operator operating on a bra. In this way our
linear operators are made capable of operating on bra vectors.

A suitable notation to use for the resulting bra when $\alpha$ operates on
the bra $\langle B|$ is $\langle B|\alpha$, as in this notation the equation which defines
$\langle B|\alpha$ is

$$\{\langle B|\alpha\}|A\rangle = \langle B|\{\alpha|A\rangle\} \tag{3}$$

for any $|A\rangle$, which simply expresses the associative axiom of
multiplication for the triple product of $\langle B|$, $\alpha$, and $|A\rangle$. We therefore
make the general rule that in a product of a bra and a linear operator,
the bra must always be put on the left. We can now write the triple
product of $\langle B|$, $\alpha$, and $|A\rangle$ simply as $\langle B|\alpha|A\rangle$ without brackets. It
may easily be verified that the distributive axiom of multiplication
holds for products of bras and linear operators just as well as for
products of linear operators and kets.
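Equation (3) of this section can be checked in a matrix model. The sketch assumes that a ket is a column of components, that a bra is given directly by its own row of components, and that a linear operator is a matrix; the names `apply_to_ket`, `apply_to_bra` and `pair` are hypothetical helpers.

```python
# Assumption: bra = row of numbers, ket = column of numbers, operator = matrix.

def apply_to_ket(M, ket):
    """alpha|A>: matrix times column."""
    return tuple(sum(M[i][k] * ket[k] for k in range(len(ket)))
                 for i in range(len(M)))

def apply_to_bra(bra, M):
    """<B|alpha: row times matrix."""
    return tuple(sum(bra[k] * M[k][j] for k in range(len(bra)))
                 for j in range(len(M[0])))

def pair(bra, ket):
    """The number formed from a bra (row) and a ket (column)."""
    return sum(b * a for b, a in zip(bra, ket))

alpha = ((1, 2j), (3, 4))
bra_B = (1 - 1j, 2)
ket_A = (1j, 5)

left = pair(apply_to_bra(bra_B, alpha), ket_A)    # {<B|alpha}|A>
right = pair(bra_B, apply_to_ket(alpha, ket_A))   # <B|{alpha|A>}

print(left, right)
```

Both groupings give the same number, which is why the triple product can be written without brackets.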

There is one further kind of product which has a meaning in our
scheme, namely the product of a ket vector and a bra vector with
the ket on the left, such as $|A\rangle\langle B|$. To examine this product, let us
multiply it into an arbitrary ket $|P\rangle$, putting the ket on the right,
and assume the associative axiom of multiplication. The product is
then $|A\rangle\langle B|P\rangle$, which is another ket, namely $|A\rangle$ multiplied by the
number $\langle B|P\rangle$, and this ket depends linearly on the ket $|P\rangle$. Thus
$|A\rangle\langle B|$ appears as a linear operator that can operate on kets. It
can also operate on bras, its product with a bra $\langle Q|$ on the left being
$\langle Q|A\rangle\langle B|$, which is the number $\langle Q|A\rangle$ times the bra $\langle B|$. The
product $|A\rangle\langle B|$ is to be sharply distinguished from the product
$\langle B|A\rangle$ of the same factors in the reverse order, the latter product
being, of course, a number.
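The action of $|A\rangle\langle B|$ on a ket can be made concrete. This is a sketch under the assumption that kets are tuples of complex components and that $|A\rangle\langle B|$ is represented by the matrix with entries $A_i \bar{B}_j$; the names `outer`, `braket` and `apply_to_ket` are hypothetical.

```python
# Assumption: kets are component tuples; the bra corresponding to B has the
# conjugated components, so |A><B| has matrix entries A_i * conj(B_j).

def braket(B, A):
    """Scalar product <B|A>."""
    return sum(b.conjugate() * a for b, a in zip(B, A))

def outer(A, B):
    """Matrix representing the linear operator |A><B|."""
    return tuple(tuple(a * b.conjugate() for b in B) for a in A)

def apply_to_ket(M, ket):
    """Matrix times column."""
    return tuple(sum(M[i][k] * ket[k] for k in range(len(ket)))
                 for i in range(len(M)))

A = (1 + 0j, 1j)
B = (2 + 0j, 1 - 1j)
P = (1j, 3 + 0j)

lhs = apply_to_ket(outer(A, B), P)         # {|A><B|}|P>
rhs = tuple(braket(B, P) * a for a in A)   # |A> times the number <B|P>

print(lhs == rhs, braket(B, P))
```

The operator applied to $|P\rangle$ returns $|A\rangle$ multiplied by the number $\langle B|P\rangle$, in agreement with the text.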

We now have a complete algebraic scheme involving three kinds
of quantities, bra vectors, ket vectors, and linear operators. They can
be multiplied together in the various ways discussed above, and the
associative and distributive axioms of multiplication always hold,
but the commutative axiom of multiplication does not hold. In this
general scheme we still have the rules of notation of the preceding
section, that any complete bracket expression, containing $\langle$ on the
left and $\rangle$ on the right, denotes a number, while any incomplete
bracket expression, containing only $\langle$ or $\rangle$, denotes a vector.

With regard to the physical significance of the scheme, we have
already assumed that the bra vectors and ket vectors, or rather the
directions of these vectors, correspond to the states of a dynamical
system at a particular time. We now make the further assumption
that the linear operators correspond to the dynamical variables at that
time. By dynamical variables are meant quantities such as the
coordinates and the components of velocity, momentum and angular
momentum of particles, and functions of these quantities, in fact
the variables in terms of which classical mechanics is built up. The
new assumption requires that these quantities shall occur also in
quantum mechanics, but with the striking difference that they are
now subject to an algebra in which the commutative axiom of
multiplication does not hold.

This different algebra for the dynamical variables is one of the
most important ways in which quantum mechanics differs from
classical mechanics. We shall see later on that, in spite of this
fundamental difference, the dynamical variables of quantum mechanics
still have many properties in common with their classical
counterparts and it will be possible to build up a theory of them closely
analogous to the classical theory and forming a beautiful
generalization of it.

It is convenient to use the same letter to denote a dynamical
variable and the corresponding linear operator. In fact, we may
consider a dynamical variable and the corresponding linear operator to
be both the same thing, without getting into confusion.

8.  Conjugate  relations 

Our linear operators are complex quantities, since one can multiply
them by complex numbers and get other quantities of the same nature.
Hence they must correspond in general to complex dynamical
variables, i.e. to complex functions of the coordinates, velocities, etc. We
need some further development of the theory to see what kind of
linear operator corresponds to a real dynamical variable.

Consider the ket which is the conjugate imaginary of $\langle P|\alpha$. This
ket depends antilinearly on $\langle P|$ and thus depends linearly on $|P\rangle$.
It may therefore be considered as the result of some linear operator
operating on $|P\rangle$. This linear operator is called the adjoint of $\alpha$ and
we shall denote it by $\bar\alpha$. With this notation, the conjugate imaginary
of $\langle P|\alpha$ is $\bar\alpha|P\rangle$.

In formula (7) of Chapter I put $\langle P|\alpha$ for $\langle A|$ and its conjugate
imaginary $\bar\alpha|P\rangle$ for $|A\rangle$. The result is

$$\langle B|\bar\alpha|P\rangle = \overline{\langle P|\alpha|B\rangle}. \tag{4}$$

This is a general formula holding for any ket vectors $|B\rangle$, $|P\rangle$ and
any linear operator $\alpha$, and it expresses one of the most frequently
used properties of the adjoint.

Putting $\bar\alpha$ for $\alpha$ in (4), we get

$$\langle B|\bar{\bar\alpha}|P\rangle = \overline{\langle P|\bar\alpha|B\rangle} = \langle B|\alpha|P\rangle,$$

by using (4) again with $|P\rangle$ and $|B\rangle$ interchanged. This holds for
any ket $|P\rangle$, so we can infer from (4) of Chapter I,

$$\langle B|\bar{\bar\alpha} = \langle B|\alpha,$$

and since this holds for any bra vector $\langle B|$, we can infer

$$\bar{\bar\alpha} = \alpha.$$

Thus the adjoint of the adjoint of a linear operator is the original linear
operator. This property of the adjoint makes it like the conjugate
complex of a number, and it is easily verified that in the special case
when the linear operator is a number, the adjoint linear operator is
the conjugate complex number. Thus it is reasonable to assume that
the adjoint of a linear operator corresponds to the conjugate complex of
a dynamical variable. With this physical significance for the adjoint
of a linear operator, we may call the adjoint alternatively the
conjugate complex linear operator, which conforms with our notation $\bar\alpha$.
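Formula (4) and the result $\bar{\bar\alpha} = \alpha$ can be verified in a finite model. The sketch assumes, as an illustration only, that the adjoint of an operator represented by a matrix is the conjugate transpose of that matrix; `adjoint` and `element` are hypothetical helper names.

```python
# Assumption: kets are component tuples, operators are matrices, and the
# adjoint is realized as the conjugate transpose.

def adjoint(M):
    """Conjugate transpose of a square matrix."""
    n = len(M)
    return tuple(tuple(M[j][i].conjugate() for j in range(n)) for i in range(n))

def element(B, M, P):
    """The number <B|M|P>, with the bra components conjugated from ket B."""
    return sum(B[i].conjugate() * M[i][j] * P[j]
               for i in range(len(B)) for j in range(len(P)))

alpha = ((1 + 1j, 2 + 0j), (3j, 4 - 1j))
B = (1 + 0j, 1j)
P = (2 - 1j, 1 + 0j)

lhs = element(B, adjoint(alpha), P)      # <B|adjoint(alpha)|P>
rhs = element(P, alpha, B).conjugate()   # conjugate complex of <P|alpha|B>
double_adjoint = adjoint(adjoint(alpha))

print(lhs, rhs, double_adjoint == alpha)
```

The two numbers agree, and taking the adjoint twice returns the original matrix.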

A linear operator may equal its adjoint, and is then called
self-adjoint. It corresponds to a real dynamical variable, so it may be
called alternatively a real linear operator. Any linear operator may
be split up into a real part and a pure imaginary part. For this
reason the words ‘conjugate complex’ are applicable to linear
operators and not the words ‘conjugate imaginary’.

The conjugate complex of the sum of two linear operators is
obviously the sum of their conjugate complexes. To get the conjugate
complex of the product of two linear operators $\alpha$ and $\beta$, we apply
formula (7) of Chapter I with

$$\langle A| = \langle P|\alpha, \qquad \langle B| = \langle Q|\bar\beta,$$

so that

$$|A\rangle = \bar\alpha|P\rangle, \qquad |B\rangle = \beta|Q\rangle.$$

The result is

$$\langle Q|\bar\beta\bar\alpha|P\rangle = \overline{\langle P|\alpha\beta|Q\rangle} = \langle Q|\overline{\alpha\beta}|P\rangle$$

from (4). Since this holds for any $|P\rangle$ and $\langle Q|$, we can infer that

$$\bar\beta\bar\alpha = \overline{\alpha\beta}. \tag{5}$$

Thus the conjugate complex of the product of two linear operators equals
the product of the conjugate complexes of the factors in the reverse order.

As simple examples of this result, it should be noted that, if $\xi$ and
$\eta$ are real, in general $\xi\eta$ is not real. This is an important difference
from classical mechanics. However, $\xi\eta + \eta\xi$ is real, and so is
$i(\xi\eta - \eta\xi)$. Only when $\xi$ and $\eta$ commute is $\xi\eta$ itself also real.
Further, if $\xi$ is real, then so is $\xi^2$ and, more generally, $\xi^n$ with $n$
any positive integer.

We may get the conjugate complex of the product of three linear
operators by successive applications of the rule (5) for the conjugate
complex of the product of two of them. We have

$$\overline{\alpha\beta\gamma} = \overline{\alpha(\beta\gamma)} = \overline{\beta\gamma}\,\bar\alpha = \bar\gamma\bar\beta\bar\alpha, \tag{6}$$

so the conjugate complex of the product of three linear operators
equals the product of the conjugate complexes of the factors in the
reverse order. The rule may easily be extended to the product of any
number of linear operators.
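Rule (5) and the remarks about products of real operators can be tried on matrices. The sketch assumes that real linear operators are modelled by matrices equal to their own conjugate transpose; the matrices `xi` and `eta` and the helpers `matmul`, `adjoint`, `combine` and `is_real` are hypothetical.

```python
# Assumption: operators are 2 x 2 complex matrices; "real" means equal to the
# conjugate transpose.

def matmul(X, Y):
    n = len(X)
    return tuple(tuple(sum(X[i][k] * Y[k][j] for k in range(n))
                       for j in range(n)) for i in range(n))

def adjoint(M):
    n = len(M)
    return tuple(tuple(M[j][i].conjugate() for j in range(n)) for i in range(n))

def combine(X, Y, cx, cy):
    """The operator cx*X + cy*Y."""
    return tuple(tuple(cx * x + cy * y for x, y in zip(rx, ry))
                 for rx, ry in zip(X, Y))

def is_real(M):
    return adjoint(M) == M

xi = ((1, 2 - 1j), (2 + 1j, 3))
eta = ((0, 1j), (-1j, 2))

xe = matmul(xi, eta)
ex = matmul(eta, xi)
reverse_rule = adjoint(xe) == matmul(adjoint(eta), adjoint(xi))
sym = combine(xe, ex, 1, 1)       # xi*eta + eta*xi
comm = combine(xe, ex, 1j, -1j)   # i(xi*eta - eta*xi)

print(reverse_rule, is_real(xe), is_real(sym), is_real(comm))
```

For this non-commuting pair the product `xe` is not real, while the symmetrized product and $i$ times the commutator both are.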

In the preceding section we saw that the product $|A\rangle\langle B|$ is a linear
operator. We may get its conjugate complex by referring directly to
the definition of the adjoint. Multiplying $|A\rangle\langle B|$ into a general bra
$\langle P|$ we get $\langle P|A\rangle\langle B|$, whose conjugate imaginary ket is

$$\overline{\langle P|A\rangle}\,|B\rangle = \langle A|P\rangle|B\rangle = |B\rangle\langle A|P\rangle.$$

Hence

$$\overline{|A\rangle\langle B|} = |B\rangle\langle A|. \tag{7}$$

We now have several rules concerning conjugate complexes and
conjugate imaginaries of products, namely equation (7) of Chapter I,
equations (4), (5), (6), (7) of this chapter, and the rule that the
conjugate imaginary of $\langle P|\alpha$ is $\bar\alpha|P\rangle$. These rules can all be summed
up in a single comprehensive rule: the conjugate complex or conjugate
imaginary of any product of bra vectors, ket vectors, and linear operators
is obtained by taking the conjugate complex or conjugate imaginary of
each factor and reversing the order of all the factors. The rule is easily
verified to hold quite generally, also for the cases not explicitly given
above.


Theorem. If $\xi$ is a real linear operator and

$$\xi^m|P\rangle = 0 \tag{8}$$

for a particular ket $|P\rangle$, $m$ being a positive integer, then

$$\xi|P\rangle = 0.$$

To prove the theorem, take first the case when $m = 2$. Equation
(8) then gives

$$\langle P|\xi^2|P\rangle = 0,$$

showing that the ket $\xi|P\rangle$ multiplied by the conjugate imaginary
bra $\langle P|\xi$ is zero. From the assumption (8) of Chapter I with $\xi|P\rangle$
for $|A\rangle$, we see that $\xi|P\rangle$ must be zero. Thus the theorem is proved
for $m = 2$.

Now take $m > 2$ and put

$$\xi^{m-2}|P\rangle = |Q\rangle.$$

Equation (8) now gives

$$\xi^2|Q\rangle = 0.$$

Applying the theorem for $m = 2$, we get

$$\xi|Q\rangle = 0$$

or

$$\xi^{m-1}|P\rangle = 0. \tag{9}$$

By repeating the process by which equation (9) is obtained from
(8), we obtain successively

$$\xi^{m-2}|P\rangle = 0, \quad \xi^{m-3}|P\rangle = 0, \quad \dots, \quad \xi^2|P\rangle = 0, \quad \xi|P\rangle = 0,$$

and so the theorem is proved generally.
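The role played by the reality of $\xi$ in this proof can be seen by comparing two matrices. The sketch is an illustration under the assumption that operators are 2 x 2 matrices acting on component tuples; `nil`, `xi`, `apply_to_ket` and `braket` are hypothetical names. The matrix `nil` is not self-adjoint, and for it the conclusion of the theorem fails.

```python
# Assumption: kets are component tuples and operators are matrices.

def apply_to_ket(M, ket):
    return tuple(sum(M[i][k] * ket[k] for k in range(len(ket)))
                 for i in range(len(M)))

def braket(B, A):
    return sum(b.conjugate() * a for b, a in zip(B, A))

# A matrix that is NOT equal to its conjugate transpose.
nil = ((0, 1), (0, 0))
P = (0, 1)
nil_P = apply_to_ket(nil, P)        # (1, 0): not zero
nil2_P = apply_to_ket(nil, nil_P)   # (0, 0): the square annihilates P

# A self-adjoint matrix: <Q|xi^2|Q> equals the squared length of xi|Q>,
# which is the step used in the proof for m = 2.
xi = ((2, 1j), (-1j, 1))
Q = (1, 1)
xi_Q = apply_to_ket(xi, Q)
xi2_Q = apply_to_ket(xi, xi_Q)
length_sq = braket(xi_Q, xi_Q)
element = braket(Q, xi2_Q)

print(nil_P, nil2_P, length_sq, element)
```

For the self-adjoint `xi` the two numbers coincide, so the vanishing of one forces the other to vanish; for `nil` the squared length of `nil_P` is 1 although `nil2_P` is zero.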

9.  Eigenvalues  and  eigenvectors 

We must make a further development of the theory of linear
operators, consisting in studying the equation

$$\alpha|P\rangle = a|P\rangle, \tag{10}$$

where $\alpha$ is a linear operator and $a$ is a number. Here $\alpha$ is regarded
as known, and the number $a$ and the ket $|P\rangle$ are the unknowns, to be
chosen so as to satisfy (10), ignoring the trivial solution $|P\rangle = 0$.

Equation (10) means that the linear operator $\alpha$ applied to the ket
$|P\rangle$ just multiplies this ket by a numerical factor without changing
its direction.

Together with equation (10), we should consider also the conjugate
imaginary form of equation

$$\langle Q|\alpha = b\langle Q|, \tag{11}$$

where $b$ is a number. Here the unknowns are the number $b$ and the
non-zero bra $\langle Q|$. Equations (10) and (11) are of such fundamental
importance in the theory that it is desirable to have some special
words to describe the relationships between the quantities involved.
If (10) is satisfied, we shall call $a$ an eigenvalue† of the linear operator
$\alpha$, or of the corresponding dynamical variable, and we shall call $|P\rangle$
an eigenket of the linear operator or dynamical variable. Further, we
shall say that the eigenket $|P\rangle$ belongs to the eigenvalue $a$. Similarly,
if (11) is satisfied, we shall call $b$ an eigenvalue of $\alpha$ and $\langle Q|$ an
eigenbra belonging to this eigenvalue. The words eigenvalue,
eigenket, eigenbra have a meaning, of course, only with reference to a
linear operator or dynamical variable.

Using this terminology, we can assert that, if an eigenket of $\alpha$ is
multiplied by any number not zero, the resulting ket is also an
eigenket and belongs to the same eigenvalue as the original one.
It is possible to have two or more independent eigenkets of a linear
operator belonging to the same eigenvalue of that linear operator,
e.g. equation (10) may have several solutions, $|P1\rangle$, $|P2\rangle$, $|P3\rangle$, ... say,
all holding for the same value of $a$, with the various eigenkets $|P1\rangle$,
$|P2\rangle$, $|P3\rangle$, ... independent. In this case it is evident that any linear
combination of the eigenkets is another eigenket belonging to the
same eigenvalue of the linear operator, e.g.

$$c_1|P1\rangle + c_2|P2\rangle + c_3|P3\rangle + \dots$$

is another solution of (10), where $c_1, c_2, c_3, \dots$ are any numbers.

In the special case when the linear operator $\alpha$ of equations (10) and
(11) is a number, $k$ say, it is obvious that any ket $|P\rangle$ and bra $\langle Q|$
will satisfy these equations provided $a$ and $b$ equal $k$. Thus a number
considered as a linear operator has just one eigenvalue, and any ket
is an eigenket and any bra is an eigenbra, belonging to this eigenvalue.

The theory of eigenvalues and eigenvectors of a linear operator $\alpha$
which is not real is not of much use for quantum mechanics. We
shall therefore confine ourselves to real linear operators for the further
development of the theory. Putting for $\alpha$ the real linear operator $\xi$,
we have instead of equations (10) and (11)

$$\xi|P\rangle = a|P\rangle, \tag{12}$$

$$\langle Q|\xi = b\langle Q|. \tag{13}$$

† The word ‘proper’ is sometimes used instead of ‘eigen’, but this is not satisfactory
as the words ‘proper’ and ‘improper’ are often used with other meanings. For example,
in §§ 15 and 46 the words ‘improper function’ and ‘proper-energy’ are used.


Three important results can now be readily deduced.

(i) The eigenvalues are all real numbers. To prove that $a$ satisfying
(12) is real, we multiply (12) by the bra $\langle P|$ on the left, obtaining

$$\langle P|\xi|P\rangle = a\langle P|P\rangle.$$

Now from equation (4) with $\langle B|$ replaced by $\langle P|$ and $\alpha$ replaced by
the real linear operator $\xi$, we see that the number $\langle P|\xi|P\rangle$ must be
real, and from (8) of § 6, $\langle P|P\rangle$ must be real and not zero. Hence $a$
is real. Similarly, by multiplying (13) by $|Q\rangle$ on the right, we can
prove that $b$ is real.

Suppose we have a solution of (12) and we form the conjugate
imaginary equation, which will read

$$\langle P|\xi = a\langle P|$$

in view of the reality of $\xi$ and $a$. This conjugate imaginary equation
now provides a solution of (13), with $\langle Q| = \langle P|$ and $b = a$. Thus
we can infer

(ii) The eigenvalues associated with eigenkets are the same as the
eigenvalues associated with eigenbras.

(iii) The conjugate imaginary of any eigenket is an eigenbra belonging
to the same eigenvalue, and conversely. This last result makes it
reasonable to call the state corresponding to any eigenket or to the
conjugate imaginary eigenbra an eigenstate of the real dynamical
variable $\xi$.
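Result (i) can be illustrated with a small self-adjoint matrix. The sketch assumes a 2 x 2 matrix model, in which the eigenvalues are the roots of the quadratic $\lambda^2 - (\text{trace})\lambda + \det = 0$; the matrix `xi` and the trial ket `P` are hypothetical choices made so that the arithmetic is simple.

```python
import cmath

# Assumption: the real linear operator is modelled by a 2 x 2 matrix equal to
# its own conjugate transpose.
xi = ((2, 1 - 1j), (1 + 1j, 3))

trace = xi[0][0] + xi[1][1]
det = xi[0][0] * xi[1][1] - xi[0][1] * xi[1][0]
disc = cmath.sqrt(trace * trace - 4 * det)
eigenvalues = ((trace - disc) / 2, (trace + disc) / 2)

def apply_to_ket(M, ket):
    return tuple(sum(M[i][k] * ket[k] for k in range(len(ket)))
                 for i in range(len(M)))

# A ket satisfying equation (12) with a = 1.
P = (-1 + 1j, 1)
xi_P = apply_to_ket(xi, P)

print(eigenvalues, xi_P == P)
```

Both roots have vanishing imaginary part, and `P` is reproduced unchanged by `xi`, so it is an eigenket belonging to the eigenvalue 1 in this model.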

Eigenvalues and eigenvectors of various real dynamical variables
are used very extensively in quantum mechanics, so it is desirable
to have some systematic notation for labelling them. The following
is suitable for most purposes. If $\xi$ is a real dynamical variable, we
call its eigenvalues $\xi'$, $\xi''$, $\xi^r$, etc. Thus we have a letter by itself
denoting a real dynamical variable or a real linear operator, and the
same letter with primes or an index attached denoting a number,
namely an eigenvalue of what the letter by itself denotes. An
eigenvector may now be labelled by the eigenvalue to which it belongs.
Thus $|\xi'\rangle$ denotes an eigenket belonging to the eigenvalue $\xi'$ of the
dynamical variable $\xi$. If in a piece of work we deal with more than
one eigenket belonging to the same eigenvalue of a dynamical variable,
we may distinguish them one from another by means of a further
label, or possibly of more than one further label. Thus, if we are
dealing with two eigenkets belonging to the same eigenvalue of $\xi$,
we may call them $|\xi'1\rangle$ and $|\xi'2\rangle$.

Theorem. Two eigenvectors of a real dynamical variable belonging
to different eigenvalues are orthogonal.

To prove the theorem, let $|\xi'\rangle$ and $|\xi''\rangle$ be two eigenkets of the real
dynamical variable $\xi$, belonging to the eigenvalues $\xi'$ and $\xi''$
respectively. Then we have the equations

$$\xi|\xi'\rangle = \xi'|\xi'\rangle, \tag{14}$$

$$\xi|\xi''\rangle = \xi''|\xi''\rangle. \tag{15}$$

Taking the conjugate imaginary of (14), we get

$$\langle\xi'|\xi = \xi'\langle\xi'|.$$

Multiplying this by $|\xi''\rangle$ on the right gives

$$\langle\xi'|\xi|\xi''\rangle = \xi'\langle\xi'|\xi''\rangle$$

and multiplying (15) by $\langle\xi'|$ on the left gives

$$\langle\xi'|\xi|\xi''\rangle = \xi''\langle\xi'|\xi''\rangle.$$

Hence, subtracting,

$$(\xi' - \xi'')\langle\xi'|\xi''\rangle = 0, \tag{16}$$

showing that, if $\xi' \neq \xi''$, $\langle\xi'|\xi''\rangle = 0$ and the two eigenvectors $|\xi'\rangle$
and $|\xi''\rangle$ are orthogonal. This theorem will be referred to as the
orthogonality theorem.
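The orthogonality theorem can be checked on the same kind of finite model. In this sketch the self-adjoint matrix `xi` and its two eigenkets `P1` and `P2`, belonging to the eigenvalues 1 and 4, are hypothetical illustrative choices; `apply_to_ket` and `braket` are assumed helpers.

```python
# Assumption: kets are component tuples, and <B|A> conjugates the components
# of B before summing.

def apply_to_ket(M, ket):
    return tuple(sum(M[i][k] * ket[k] for k in range(len(ket)))
                 for i in range(len(M)))

def braket(B, A):
    return sum(b.conjugate() * a for b, a in zip(B, A))

xi = ((2, 1 - 1j), (1 + 1j, 3))   # equal to its conjugate transpose

P1 = (-1 + 1j, 1)   # eigenket belonging to the eigenvalue 1
P2 = (1 - 1j, 2)    # eigenket belonging to the eigenvalue 4

is_eigen_1 = apply_to_ket(xi, P1) == tuple(1 * p for p in P1)
is_eigen_4 = apply_to_ket(xi, P2) == tuple(4 * p for p in P2)
overlap = braket(P1, P2)          # <P1|P2>

print(is_eigen_1, is_eigen_4, overlap)
```

The two eigenvalues differ and the scalar product of the two eigenkets comes out as zero, as equation (16) requires.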

We have been discussing properties of the eigenvalues and
eigenvectors of a real linear operator, but have not yet considered the
question of whether, for a given real linear operator, any eigenvalues
and eigenvectors exist, and if so, how to find them. This question
is in general very difficult to answer. There is one useful special case,
however, which is quite tractable, namely when the real linear
operator, $\xi$ say, satisfies an algebraic equation

$$\phi(\xi) \equiv \xi^n + a_1\xi^{n-1} + a_2\xi^{n-2} + \dots + a_n = 0, \tag{17}$$

the coefficients $a$ being numbers. This equation means, of course,
that the linear operator $\phi(\xi)$ produces the result zero when applied
to any ket vector or to any bra vector.

Let (17) be the simplest algebraic equation that $\xi$ satisfies. Then
it will be shown that

($\alpha$) The number of eigenvalues of $\xi$ is $n$.

($\beta$) There are so many eigenkets of $\xi$ that any ket whatever can
be expressed as a sum of such eigenkets.

The algebraic form $\phi(\xi)$ can be factorized into $n$ linear factors, the
result being

$$\phi(\xi) \equiv (\xi - c_1)(\xi - c_2)(\xi - c_3)\dots(\xi - c_n) \tag{18}$$

say, the $c$'s being numbers, not assumed to be all different. This
factorization can be performed with $\xi$ a linear operator just as well
as with $\xi$ an ordinary algebraic variable, since there is nothing
occurring in (18) that does not commute with $\xi$. Let the quotient
when $\phi(\xi)$ is divided by $(\xi - c_r)$ be $\chi_r(\xi)$, so that

$$\phi(\xi) \equiv (\xi - c_r)\chi_r(\xi) \qquad (r = 1, 2, 3, \dots, n).$$

Then, for any ket $|P\rangle$,

$$(\xi - c_r)\chi_r(\xi)|P\rangle = \phi(\xi)|P\rangle = 0. \tag{19}$$

Now $\chi_r(\xi)|P\rangle$ cannot vanish for every ket $|P\rangle$, as otherwise $\chi_r(\xi)$
itself would vanish and we should have $\xi$ satisfying an algebraic
equation of degree $n - 1$, which would contradict the assumption that
(17) is the simplest equation that $\xi$ satisfies. If we choose $|P\rangle$ so that
$\chi_r(\xi)|P\rangle$ does not vanish, then equation (19) shows that $\chi_r(\xi)|P\rangle$ is
an eigenket of $\xi$, belonging to the eigenvalue $c_r$. The argument holds
for each value of $r$ from 1 to $n$, and hence each of the $c$'s is an
eigenvalue of $\xi$. No other number can be an eigenvalue of $\xi$, since if
$\xi'$ is any eigenvalue, belonging to an eigenket $|\xi'\rangle$,

$$\xi|\xi'\rangle = \xi'|\xi'\rangle$$

and we can deduce

$$\phi(\xi)|\xi'\rangle = \phi(\xi')|\xi'\rangle,$$

and since the left-hand side vanishes we must have $\phi(\xi') = 0$.

To complete the proof of ($\alpha$) we must verify that the $c$'s are all different. Suppose the $c$'s are not all different and $c_s$ occurs $m$ times say, with $m > 1$. Then $\phi(\xi)$ is of the form

$$\phi(\xi) \equiv (\xi-c_s)^m\,\theta(\xi),$$

with $\theta(\xi)$ a rational integral function of $\xi$. Equation (17) now gives us

$$(\xi-c_s)^m\,\theta(\xi)|A\rangle = 0 \tag{20}$$

for any ket $|A\rangle$. Since $c_s$ is an eigenvalue of $\xi$ it must be real, so that $\xi-c_s$ is a real linear operator. Equation (20) is now of the same form as equation (8) with $\xi-c_s$ for $\xi$ and $\theta(\xi)|A\rangle$ for $|P\rangle$. From the theorem connected with equation (8) we can infer that

$$(\xi-c_s)\,\theta(\xi)|A\rangle = 0.$$

Since the ket $|A\rangle$ is arbitrary,

$$(\xi-c_s)\,\theta(\xi) = 0,$$

which contradicts the assumption that (17) is the simplest equation that $\xi$ satisfies. Hence the $c$'s are all different and ($\alpha$) is proved.

Let $\chi_r(c_r)$ be the number obtained when $c_r$ is substituted for $\xi$ in the algebraic expression $\chi_r(\xi)$. Since the $c$'s are all different, $\chi_r(c_r)$ cannot vanish. Consider now the expression

$$\sum_r \frac{\chi_r(\xi)}{\chi_r(c_r)} - 1. \tag{21}$$

If $c_s$ is substituted for $\xi$ here, every term in the sum vanishes except the one for which $r = s$, since $\chi_r(\xi)$ contains $(\xi-c_s)$ as a factor when $r \neq s$, and the term for which $r = s$ is unity, so the whole expression vanishes. Thus the expression (21) vanishes when $\xi$ is put equal to any of the $n$ numbers $c_1, c_2, \ldots, c_n$. Since, however, the expression is only of degree $n-1$ in $\xi$, it must vanish identically. If we now apply the linear operator (21) to an arbitrary ket $|P\rangle$ and equate the result to zero, we get

$$|P\rangle = \sum_r \frac{1}{\chi_r(c_r)}\,\chi_r(\xi)|P\rangle. \tag{22}$$

Each term in the sum on the right here is, according to (19), an eigenket of $\xi$, if it does not vanish. Equation (22) thus expresses the arbitrary ket $|P\rangle$ as a sum of eigenkets of $\xi$, and thus ($\beta$) is proved.

As a simple example we may consider a real linear operator $\sigma$ that satisfies the equation

$$\sigma^2 = 1. \tag{23}$$

Then $\sigma$ has the two eigenvalues 1 and $-1$. Any ket $|P\rangle$ can be expressed as

$$|P\rangle = \tfrac{1}{2}(1+\sigma)|P\rangle + \tfrac{1}{2}(1-\sigma)|P\rangle.$$

It is easily verified that the two terms on the right here are eigenkets of $\sigma$, belonging to the eigenvalues 1 and $-1$ respectively, when they do not vanish.
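
The verification just mentioned can be sketched numerically. The block below assumes a hypothetical 2x2 matrix `SIGMA` whose square is unity and an arbitrary ket `KET_P`; both are illustrative choices, and the helper names are not part of the text.

```python
# Sketch: splitting a ket into eigenkets of an operator with sigma^2 = 1.
# SIGMA and KET_P are hypothetical choices made only for this illustration.

def apply(op, ket):
    """Return op|ket> for a square matrix given as a list of rows."""
    return [sum(row[j] * ket[j] for j in range(len(ket))) for row in op]

SIGMA = [[0.0, 1.0],
         [1.0, 0.0]]   # real, and applying it twice gives the original ket

def split(ket):
    """Return (1/2)(1 + sigma)|P> and (1/2)(1 - sigma)|P>."""
    s = apply(SIGMA, ket)
    plus = [(p + q) / 2 for p, q in zip(ket, s)]
    minus = [(p - q) / 2 for p, q in zip(ket, s)]
    return plus, minus

KET_P = [3.0, 1.0]
PLUS, MINUS = split(KET_P)

# PLUS is unchanged by SIGMA (eigenvalue 1); MINUS changes sign (eigenvalue -1).
SIGMA_PLUS = apply(SIGMA, PLUS)
SIGMA_MINUS = apply(SIGMA, MINUS)
```

Adding `PLUS` and `MINUS` component by component recovers `KET_P`, which is the content of the expansion above for this particular case.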


10.  Observables 

We  have  made  a number  of  assumptions  about  the  way  in  which 
states  and  dynamical  variables  are  to  be  represented  mathematically 
in  the  theory.  These  assumptions  are  not,  by  themselves,  laws  of 
nature,  but  become  laws  of  nature  when  we  make  some  further 
assumptions  that  provide  a physical  interpretation  of  the  theory. 
Such further assumptions must take the form of establishing connexions between the results of observations, on one hand, and the equations of the mathematical formalism on the other.

When we make an observation we measure some dynamical variable. It is obvious physically that the result of such a measurement must always be a real number, so we should expect that any dynamical variable that we can measure must be a real dynamical variable. One might think one could measure a complex dynamical variable by measuring separately its real and pure imaginary parts. But this would involve two measurements or two observations, which would be all right in classical mechanics, but would not do in quantum mechanics, where two observations in general interfere with one another: it is not in general permissible to consider that two observations can be made exactly simultaneously, and if they are made in quick succession the first will usually disturb the state of the system and introduce an indeterminacy that will affect the second. We therefore have to restrict the dynamical variables that we can measure to be real, the condition for this in quantum mechanics being as given in § 8. Not every real dynamical variable can be measured, however. A further restriction is needed, as we shall see later.

We now make some assumptions for the physical interpretation of the theory. If the dynamical system is in an eigenstate of a real dynamical variable $\xi$, belonging to the eigenvalue $\xi'$, then a measurement of $\xi$ will certainly give as result the number $\xi'$. Conversely, if the system is in a state such that a measurement of a real dynamical variable $\xi$ is certain to give one particular result (instead of giving one or other of several possible results according to a probability law, as is in general the case), then the state is an eigenstate of $\xi$ and the result of the measurement is the eigenvalue of $\xi$ to which this eigenstate belongs. These assumptions are reasonable on account of the eigenvalues of real linear operators being always real numbers.

Some of the immediate consequences of the assumptions will be noted. If we have two or more eigenstates of a real dynamical variable $\xi$ belonging to the same eigenvalue $\xi'$, then any state formed by superposition of them will also be an eigenstate of $\xi$ belonging to the eigenvalue $\xi'$. We can infer that if we have two or more states for which a measurement of $\xi$ is certain to give the result $\xi'$, then for any state formed by superposition of them a measurement of $\xi$ will still be certain to give the result $\xi'$. This gives us some insight into the physical significance of superposition of states. Again, two eigenstates of $\xi$ belonging to different eigenvalues are orthogonal. We can infer that two states for which a measurement of $\xi$ is certain to give two different results are orthogonal. This gives us some insight into the physical significance of orthogonal states.

When we measure a real dynamical variable $\xi$, the disturbance involved in the act of measurement causes a jump in the state of the dynamical system. From physical continuity, if we make a second measurement of the same dynamical variable $\xi$ immediately after the first, the result of the second measurement must be the same as that of the first. Thus after the first measurement has been made, there is no indeterminacy in the result of the second. Hence, after the first measurement has been made, the system is in an eigenstate of the dynamical variable $\xi$, the eigenvalue it belongs to being equal to the result of the first measurement. This conclusion must still hold if the second measurement is not actually made. In this way we see that a measurement always causes the system to jump into an eigenstate of the dynamical variable that is being measured, the eigenvalue this eigenstate belongs to being equal to the result of the measurement.

We can infer that, with the dynamical system in any state, any result of a measurement of a real dynamical variable is one of its eigenvalues. Conversely, every eigenvalue is a possible result of a measurement of the dynamical variable for some state of the system, since it is certainly the result if the state is an eigenstate belonging to this eigenvalue. This gives us the physical significance of eigenvalues. The set of eigenvalues of a real dynamical variable are just the possible results of measurements of that dynamical variable and the calculation of eigenvalues is for this reason an important problem.

Another assumption we make connected with the physical interpretation of the theory is that, if a certain real dynamical variable $\xi$ is measured with the system in a particular state, the states into which the system may jump on account of the measurement are such that the original state is dependent on them. Now these states into which the system may jump are all eigenstates of $\xi$, and hence the original state is dependent on eigenstates of $\xi$. But the original state may be any state, so we can conclude that any state is dependent on eigenstates of $\xi$. If we define a complete set of states to be a set such that any state is dependent on them, then our conclusion can be formulated: the eigenstates of $\xi$ form a complete set.

Not every real dynamical variable has sufficient eigenstates to form a complete set. Those whose eigenstates do not form complete sets are not quantities that can be measured. We obtain in this way a further condition that a dynamical variable has to satisfy in order that it shall be susceptible to measurement, in addition to the condition that it shall be real. We call a real dynamical variable whose eigenstates form a complete set an observable. Thus any quantity that can be measured is an observable.

The question now presents itself: can every observable be measured? The answer theoretically is yes. In practice it may be very awkward, or perhaps even beyond the ingenuity of the experimenter, to devise an apparatus which could measure some particular observable, but the theory always allows one to imagine that the measurement can be made.

Let us examine mathematically the condition for a real dynamical variable $\xi$ to be an observable. Its eigenvalues may consist of a (finite or infinite) discrete set of numbers, or alternatively, they may consist of all numbers in a certain range, such as all numbers lying between $a$ and $b$. In the former case, the condition that any state is dependent on eigenstates of $\xi$ is that any ket can be expressed as a sum of eigenkets of $\xi$. In the latter case the condition needs modification, since one may have an integral instead of a sum, i.e. a ket $|P\rangle$ may be expressible as an integral of eigenkets of $\xi$,

$$|P\rangle = \int |\xi'\rangle\, d\xi', \tag{24}$$

$|\xi'\rangle$ being an eigenket of $\xi$ belonging to the eigenvalue $\xi'$ and the range of integration being the range of eigenvalues, as such a ket is dependent on eigenkets of $\xi$. Not every ket dependent on eigenkets of $\xi$ can be expressed in the form of the right-hand side of (24), since one of the eigenkets itself cannot, and more generally any sum of eigenkets cannot. The condition for the eigenstates of $\xi$ to form a complete set must thus be formulated, that any ket $|P\rangle$ can be expressed as an integral plus a sum of eigenkets of $\xi$, i.e.

$$|P\rangle = \int |\xi' c\rangle\, d\xi' + \sum_r |\xi^r d\rangle, \tag{25}$$

where the $|\xi' c\rangle$, $|\xi^r d\rangle$ are all eigenkets of $\xi$, the labels $c$ and $d$ being inserted to distinguish them when the eigenvalues $\xi'$ and $\xi^r$ are equal, and where the integral is taken over the whole range of eigenvalues and the sum is taken over any selection of them. If this condition is satisfied in the case when the eigenvalues of $\xi$ consist of a range of numbers, then $\xi$ is an observable.

There is a more general case that sometimes occurs, namely the eigenvalues of $\xi$ may consist of a range of numbers together with a discrete set of numbers lying outside the range. In this case the condition that $\xi$ shall be an observable is still that any ket shall be expressible in the form of the right-hand side of (25) [...].

It is often very difficult to decide mathematically whether a particular real dynamical variable satisfies the condition for being an observable [...]. [...] holds for any $|P\rangle$ whatever, [...] on account of $\xi$ being an observable. [...]

$$(\xi-\xi')(\xi-\xi'')\cdots(\xi-\xi^n) = 0. \tag{26}$$

As an example we may consider the linear operator $|A\rangle\langle A|$, where $|A\rangle$ is a normalized ket. This linear operator is real according to (7), and its square is

$$\{|A\rangle\langle A|\}^2 = |A\rangle\langle A|A\rangle\langle A| = |A\rangle\langle A| \tag{27}$$

since $\langle A|A\rangle = 1$. Thus its square equals itself and so it satisfies an algebraic equation and is an observable. Its eigenvalues are 1 and 0, with $|A\rangle$ as the eigenket belonging to the eigenvalue 1 and all kets orthogonal to $|A\rangle$ as eigenkets belonging to the eigenvalue 0. A measurement of the observable thus certainly gives the result 1 if the dynamical system is in the state corresponding to $|A\rangle$ and the result 0 if the system is in any orthogonal state, so the observable may be described as the quantity which determines whether the system is in the state $|A\rangle$ or not.
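
The properties of $|A\rangle\langle A|$ can be illustrated with a short sketch. It assumes a hypothetical two-component normalized ket `KET_A` and a ket `KET_B` orthogonal to it; the vectors and helper names are illustrative assumptions only.

```python
# Sketch: the operator |A><A| for a hypothetical normalized ket |A>.

def outer(ket):
    """Matrix of |A><A| built from the components of |A>."""
    return [[a * b.conjugate() for b in ket] for a in ket]

def matmul(x, y):
    """Product of two square matrices given as lists of rows."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def apply(op, ket):
    """Return op|ket>."""
    return [sum(row[j] * ket[j] for j in range(len(ket))) for row in op]

def close(u, v, tol=1e-12):
    """Component-wise comparison of two vectors within a tolerance."""
    return all(abs(a - b) < tol for a, b in zip(u, v))

KET_A = [0.6 + 0j, 0.8j]     # normalized: 0.36 + 0.64 = 1
KET_B = [0.8 + 0j, -0.6j]    # chosen orthogonal to KET_A

PROJ = outer(KET_A)
PROJ_SQUARED = matmul(PROJ, PROJ)

# |A> belongs to the eigenvalue 1, the orthogonal ket to the eigenvalue 0.
ON_A = apply(PROJ, KET_A)
ON_B = apply(PROJ, KET_B)
```

Within rounding error `PROJ_SQUARED` agrees with `PROJ`, `ON_A` reproduces `KET_A`, and `ON_B` vanishes, in line with the eigenvalues 1 and 0 stated above.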

Before concluding this section we should examine the conditions for an integral such as occurs in (24) to be significant. Suppose $|X\rangle$ and $|Y\rangle$ are two kets which can be expressed as integrals of eigenkets of the observable $\xi$,

$$|X\rangle = \int |\xi' x\rangle\, d\xi', \qquad |Y\rangle = \int |\xi'' y\rangle\, d\xi'',$$

$x$ and $y$ being used as labels to distinguish the two integrands. Then we have, taking the conjugate imaginary of the first equation and multiplying by the second,

$$\langle X|Y\rangle = \iint \langle\xi' x|\xi'' y\rangle\, d\xi'\, d\xi''. \tag{28}$$

Consider now the single integral

$$\int \langle\xi' x|\xi'' y\rangle\, d\xi''. \tag{29}$$

From the orthogonality theorem, the integrand here must vanish over the whole range of integration, except the one point $\xi'' = \xi'$. If the integrand is finite at this point, the integral (29) vanishes, and if this holds for all $\xi'$, we get from (28) that $\langle X|Y\rangle$ vanishes. Now in general $\langle X|Y\rangle$ does not vanish, so in general $\langle\xi' x|\xi' y\rangle$ must be infinitely great in such a way as to make (29) non-vanishing and finite. The form of infinity required for this will be discussed in § 15.

In our work up to the present it has been implied that our bra and ket vectors are of finite length and their scalar products are finite. We see now the need for relaxing this condition when we are dealing with eigenvectors of an observable whose eigenvalues form a range. If we did not relax it, the phenomenon of ranges of eigenvalues could not occur and our theory would be too weak for most practical problems.


Taking $|Y\rangle = |X\rangle$ above, we get the result that in general $\langle\xi' x|\xi' x\rangle$ is infinitely great. We shall assume that if $|\xi' x\rangle \neq 0$

$$\int \langle\xi' x|\xi'' x\rangle\, d\xi'' > 0, \tag{30}$$

as the axiom corresponding to (8) of § 6 for vectors of infinite length.

The space of bra or ket vectors when the vectors are restricted to be of finite length and to have finite scalar products is called by mathematicians a Hilbert space. The bra and ket vectors that we now use form a more general space than a Hilbert space.

We can now see that the expansion of a ket $|P\rangle$ in the form of the right-hand side of (25) is unique, provided there are not two or more terms in the sum referring to the same eigenvalue. To prove this result, let us suppose that two different expansions of $|P\rangle$ are possible. Then by subtracting one from the other, we get an equation of the form

$$0 = \int |\xi' a\rangle\, d\xi' + \sum_s |\xi^s b\rangle, \tag{31}$$

$a$ and $b$ being used as new labels for the eigenvectors, and the sum over $s$ including all terms left after the subtraction of one sum from the other. If there is a term in the sum in (31) referring to an eigenvalue $\xi^t$ not in the range, we get, by multiplying (31) on the left by $\langle\xi^t b|$ and using the orthogonality theorem,

$$0 = \langle\xi^t b|\xi^t b\rangle,$$

which contradicts (8) of § 6. Again, if the integrand in (31) does not vanish for some eigenvalue $\xi''$ not equal to any $\xi^s$ occurring in the sum, we get, by multiplying (31) on the left by $\langle\xi'' a|$ and using the orthogonality theorem,

$$0 = \int \langle\xi'' a|\xi' a\rangle\, d\xi',$$

which contradicts (30). Finally, if there is a term in the sum in (31) referring to an eigenvalue $\xi^t$ in the range, we get, multiplying (31) on the left by $\langle\xi^t b|$,

$$0 = \int \langle\xi^t b|\xi' a\rangle\, d\xi' + \langle\xi^t b|\xi^t b\rangle \tag{32}$$

and multiplying (31) on the left by $\langle\xi^t a|$

$$0 = \int \langle\xi^t a|\xi' a\rangle\, d\xi' + \langle\xi^t a|\xi^t b\rangle. \tag{33}$$

Now the integral in (33) is finite, so $\langle\xi^t a|\xi^t b\rangle$ is finite and $\langle\xi^t b|\xi^t a\rangle$ is finite. The integral in (32) must then be zero, so $\langle\xi^t b|\xi^t b\rangle$ is zero and we again have a contradiction. Thus every term in (31) must vanish and the expansion of a ket $|P\rangle$ in the form of the right-hand side of (25) must be unique.


11.  Functions  of  observables 

Let $\xi$ be an observable. We can multiply it by any real number $k$ and get another observable $k\xi$. In order that our theory may be self-consistent it is necessary that, when the system is in a state such that a measurement of the observable $\xi$ certainly gives the result $\xi'$, a measurement of the observable $k\xi$ shall certainly give the result $k\xi'$. It is easily verified that this condition is fulfilled. The ket corresponding to a state for which a measurement of $\xi$ certainly gives the result $\xi'$ is an eigenket of $\xi$, $|\xi'\rangle$ say, satisfying

$$\xi|\xi'\rangle = \xi'|\xi'\rangle.$$

This equation leads to

$$k\xi|\xi'\rangle = k\xi'|\xi'\rangle,$$

showing that $|\xi'\rangle$ is an eigenket of $k\xi$ belonging to the eigenvalue $k\xi'$, and thus that a measurement of $k\xi$ will certainly give the result $k\xi'$.

More generally, we may take any real function of $\xi$, $f(\xi)$ say, and consider it as a new observable which is automatically measured whenever $\xi$ is measured, since an experimental determination of the value of $\xi$ also provides the value of $f(\xi)$. We need not restrict $f(\xi)$ to be real, and then its real and pure imaginary parts are two observables which are automatically measured when $\xi$ is measured. For the theory to be consistent it is necessary that, when the system is in a state such that a measurement of $\xi$ certainly gives the result $\xi'$, a measurement of the real and pure imaginary parts of $f(\xi)$ shall certainly give for results the real and pure imaginary parts of $f(\xi')$. In the case when $f(\xi)$ is expressible as a power series

$$f(\xi) = c_0 + c_1\xi + c_2\xi^2 + c_3\xi^3 + \cdots,$$

the $c$'s being numbers, this condition can again be verified by elementary algebra. In the case of more general functions $f$ it may not be possible to verify the condition. The condition may then be used to define $f(\xi)$, which we have not yet defined mathematically. In this way we can get a more general definition of a function of an observable than is provided by power series.

We define $f(\xi)$ in general to be that linear operator which satisfies

$$f(\xi)|\xi'\rangle = f(\xi')|\xi'\rangle \tag{34}$$

for every eigenket $|\xi'\rangle$ of $\xi$, $f(\xi')$ being a number for each eigenvalue $\xi'$. It is easily seen that this definition is self-consistent when applied to eigenkets $|\xi'\rangle$ that are not independent. If we have an eigenket $|\xi' A\rangle$ dependent on other eigenkets of $\xi$, these other eigenkets must all belong to the same eigenvalue $\xi'$, otherwise we should have an equation of the type (31), which we have seen is impossible. On multiplying the equation which expresses $|\xi' A\rangle$ linearly in terms of the other eigenkets of $\xi$ by $f(\xi)$ on the left, we merely multiply each term in it by the number $f(\xi')$, so we obviously get a consistent equation. Further, equation (34) is sufficient to define the linear operator $f(\xi)$ completely, since to get the result of $f(\xi)$ multiplied into an arbitrary ket $|P\rangle$, we have only to expand $|P\rangle$ in the form of the right-hand side of (25) and take

$$f(\xi)|P\rangle = \int f(\xi')|\xi' c\rangle\, d\xi' + \sum_r f(\xi^r)|\xi^r d\rangle. \tag{35}$$

The conjugate complex $\overline{f(\xi)}$ of $f(\xi)$ is defined by the conjugate imaginary equation to (34), namely

$$\langle\xi'|\,\overline{f(\xi)} = \bar{f}(\xi')\langle\xi'|,$$

holding for any eigenbra $\langle\xi'|$, $\bar{f}(\xi')$ being the conjugate complex function to $f(\xi')$. Let us replace $\xi'$ here by $\xi''$ and multiply the equation on the right by the arbitrary ket $|P\rangle$. Then we get, using the expansion (25) for $|P\rangle$,

$$\begin{aligned}
\langle\xi''|\,\overline{f(\xi)}\,|P\rangle &= \bar{f}(\xi'')\langle\xi''|P\rangle \\
&= \int \bar{f}(\xi'')\langle\xi''|\xi' c\rangle\, d\xi' + \sum_r \bar{f}(\xi'')\langle\xi''|\xi^r d\rangle \\
&= \int \bar{f}(\xi'')\langle\xi''|\xi' c\rangle\, d\xi' + \bar{f}(\xi'')\langle\xi''|\xi'' d\rangle
\end{aligned} \tag{36}$$

with the help of the orthogonality theorem, $\langle\xi''|\xi'' d\rangle$ being understood to be zero if $\xi''$ is not one of the eigenvalues to which the terms in the sum in (25) refer. Again, putting the conjugate complex function $\bar{f}(\xi')$ for $f(\xi')$ in (35) and multiplying on the left by $\langle\xi''|$, we get

$$\langle\xi''|\bar{f}(\xi)|P\rangle = \int \bar{f}(\xi')\langle\xi''|\xi' c\rangle\, d\xi' + \bar{f}(\xi'')\langle\xi''|\xi'' d\rangle.$$

The right-hand side here equals that of (36), since the integrands vanish for $\xi' \neq \xi''$, and hence

$$\langle\xi''|\,\overline{f(\xi)}\,|P\rangle = \langle\xi''|\bar{f}(\xi)|P\rangle.$$

This holds for $\langle\xi''|$ any eigenbra and $|P\rangle$ any ket, so

$$\overline{f(\xi)} = \bar{f}(\xi). \tag{37}$$

Thus the conjugate complex of the linear operator $f(\xi)$ is the conjugate complex function $\bar{f}$ of $\xi$.

It follows as a corollary that if $f(\xi')$ is a real function of $\xi'$, $f(\xi)$ is a real linear operator. $f(\xi)$ is then also an observable, since its eigenstates form a complete set, every eigenstate of $\xi$ being also an eigenstate of $f(\xi)$.

With the above definition we are able to give a meaning to any function $f$ of an observable, provided only that the domain of existence of the function of a real variable $f(x)$ includes all the eigenvalues of the observable. If the domain of existence contains other points besides these eigenvalues, then the values of $f(x)$ for these other points will not affect the function of the observable. The function need not be analytic or continuous. The eigenvalues of a function $f$ of an observable are just the function $f$ of the eigenvalues of the observable.
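
For an observable with a finite number of eigenvalues, the definition (34) can be sketched directly. The block below assumes a hypothetical 2x2 real symmetric matrix `XI` with hand-computed eigenvalues 1 and 3 and real, unnormalized eigenkets; the names `EIGEN` and `function_of` are illustrative, not part of the theory.

```python
# Sketch: f(xi) defined through its action on eigenkets, as in equation (34).
# The observable XI and its hand-computed eigenkets are hypothetical examples.

XI = [[2.0, 1.0],
      [1.0, 2.0]]

# (eigenvalue, unnormalized eigenket with real components)
EIGEN = [(1.0, [1.0, -1.0]),
         (3.0, [1.0, 1.0])]

def function_of(f):
    """Matrix of f(xi): sum over eigenvalues of f(xi') |xi'><xi'| / <xi'|xi'>."""
    n = len(XI)
    out = [[0.0] * n for _ in range(n)]
    for value, ket in EIGEN:
        norm = sum(c * c for c in ket)   # <xi'|xi'> for real components
        for i in range(n):
            for j in range(n):
                out[i][j] += f(value) * ket[i] * ket[j] / norm
    return out

# A power-series function: agrees with the ordinary product xi xi.
SQUARE = function_of(lambda x: x * x)

# A step function: neither analytic nor continuous, yet still well defined.
STEP = function_of(lambda x: 1.0 if x > 2 else 0.0)

# The identity function must give back the observable itself.
SAME = function_of(lambda x: x)
```

The step function needs to be known only at the two eigenvalues; its values elsewhere play no part, as stated above.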

It is important to observe that the possibility of defining a function $f$ of an observable requires the existence of a unique number $f(x)$ for each value of $x$ which is an eigenvalue of the observable. Thus the function $f(x)$ must be single-valued. This may be illustrated by considering the question: When we have an observable $f(A)$ which is a real function of the observable $A$, is the observable $A$ a function of the observable $f(A)$? The answer to this is yes, if different eigenvalues $A'$ of $A$ always lead to different values of $f(A')$. If, however, there exist two different eigenvalues of $A$, $A'$ and $A''$ say, such that $f(A') = f(A'')$, then, corresponding to the eigenvalue $f(A')$ of the observable $f(A)$, there will not be a unique eigenvalue of the observable $A$ and the latter will not be a function of the observable $f(A)$.

It may easily be verified mathematically, from the definition, that the sum or product of two functions of an observable is a function of that observable and that a function of a function of an observable is a function of that observable. Also it is easily seen that the whole theory of functions of an observable is symmetrical between bras and kets and that we could equally well work from the equation

$$\langle\xi'|f(\xi) = f(\xi')\langle\xi'| \tag{38}$$

instead of from (34).

We shall conclude this section with a discussion of two examples which are of great practical importance, namely the reciprocal and the square root. The reciprocal of an observable exists if the observable does not have the eigenvalue zero. If the observable $\alpha$ does not have the eigenvalue zero, the reciprocal observable, which we call $\alpha^{-1}$ or $1/\alpha$, will satisfy

$$\alpha^{-1}|\alpha'\rangle = \alpha'^{-1}|\alpha'\rangle, \tag{39}$$

where $|\alpha'\rangle$ is an eigenket of $\alpha$ belonging to the eigenvalue $\alpha'$. Hence

$$\alpha\alpha^{-1}|\alpha'\rangle = \alpha\alpha'^{-1}|\alpha'\rangle = |\alpha'\rangle.$$

Since this holds for any eigenket $|\alpha'\rangle$, we must have

$$\alpha\alpha^{-1} = 1. \tag{40}$$

Similarly,

$$\alpha^{-1}\alpha = 1. \tag{41}$$

Either of these equations is sufficient to determine $\alpha^{-1}$ completely, provided $\alpha$ does not have the eigenvalue zero. To prove this in the case of (40), let $x$ be any linear operator satisfying the equation

$$\alpha x = 1$$

and multiply both sides on the left by the $\alpha^{-1}$ defined by (39). The result is

$$\alpha^{-1}\alpha x = \alpha^{-1}$$

and hence from (41)

$$x = \alpha^{-1}.$$

Equations (40) and (41) can be used to define the reciprocal, when it exists, of a general linear operator $\alpha$, which need not even be real. One of these equations by itself is then not necessarily sufficient. If any two linear operators $\alpha$ and $\beta$ have reciprocals, their product $\alpha\beta$ has the reciprocal

$$(\alpha\beta)^{-1} = \beta^{-1}\alpha^{-1}, \tag{42}$$

obtained by taking the reciprocal of each factor and reversing their order. We verify (42) by noting that its right-hand side gives unity when multiplied by $\alpha\beta$, either on the right or on the left. This reciprocal law for products can be immediately extended to more than two factors, i.e.,

$$(\alpha\beta\gamma\ldots)^{-1} = \ldots\gamma^{-1}\beta^{-1}\alpha^{-1}.$$
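
The reversal of order in (42) can be seen in a small sketch. It assumes two hypothetical 2x2 matrices `ALPHA` and `BETA` that do not commute; the matrices and the helper `reciprocal` (the elementary 2x2 inverse formula) are illustrative assumptions.

```python
# Sketch: the reciprocal of a product for two hypothetical 2x2 operators.

def matmul(x, y):
    """Product of two square matrices given as lists of rows."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def reciprocal(m):
    """Inverse of a 2x2 matrix; fails if the determinant is zero."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        # A vanishing determinant means the operator has the eigenvalue zero.
        raise ValueError("no reciprocal: the operator has the eigenvalue zero")
    return [[d / det, -b / det], [-c / det, a / det]]

ALPHA = [[2.0, 1.0],
         [0.0, 1.0]]
BETA = [[1.0, 0.0],
        [3.0, 1.0]]

LEFT = reciprocal(matmul(ALPHA, BETA))                    # (alpha beta)^-1
RIGHT = matmul(reciprocal(BETA), reciprocal(ALPHA))       # beta^-1 alpha^-1
WRONG_ORDER = matmul(reciprocal(ALPHA), reciprocal(BETA)) # alpha^-1 beta^-1
```

Here `LEFT` and `RIGHT` agree, while `WRONG_ORDER` differs from both because `ALPHA` and `BETA` do not commute.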

The square root of an observable $\alpha$ always exists, and is real if $\alpha$ has no negative eigenvalues. We write it $\sqrt{\alpha}$ or $\alpha^{1/2}$. It satisfies

$$\sqrt{\alpha}\,|\alpha'\rangle = \pm\sqrt{\alpha'}\,|\alpha'\rangle, \tag{43}$$

$|\alpha'\rangle$ being an eigenket of $\alpha$ belonging to the eigenvalue $\alpha'$. Hence

$$\sqrt{\alpha}\sqrt{\alpha}\,|\alpha'\rangle = \sqrt{\alpha'}\sqrt{\alpha'}\,|\alpha'\rangle = \alpha'|\alpha'\rangle = \alpha|\alpha'\rangle,$$

and since this holds for any eigenket $|\alpha'\rangle$ we must have

$$\sqrt{\alpha}\sqrt{\alpha} = \alpha. \tag{44}$$

On account of the ambiguity of sign in (43) there will be several square roots. To fix one of them we must specify a particular sign in (43) for each eigenvalue. This sign may vary irregularly from one eigenvalue to the next and equation (43) will always define a linear operator $\sqrt{\alpha}$ satisfying (44) and forming a square-root function of $\alpha$. If there is an eigenvalue of $\alpha$ with two or more independent eigenkets belonging to it, then we must, according to our definition of a function, have the same sign in (43) for each of these eigenkets. If we took different signs, however, equation (44) would still hold, and hence equation (44) by itself is not sufficient to define $\sqrt{\alpha}$, except in the special case when there is only one independent eigenket of $\alpha$ belonging to any eigenvalue.

The number of different square roots of an observable is $2^n$, where $n$ is the total number of eigenvalues not zero. In practice the square-root function is used only for observables without negative eigenvalues and the particular square root that is useful is the one for which the positive sign is always taken in (43). This one will be called the positive square root.
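
The count $2^n$ can be illustrated by enumeration. The sketch below assumes a hypothetical observable that is already diagonal, so that it is specified by a list of non-negative eigenvalues; a repeated entry stands for an eigenvalue with several independent eigenkets, and the function name `square_roots` is an illustrative choice.

```python
# Sketch: enumerating the square roots of a hypothetical diagonal observable
# with non-negative eigenvalues, one sign choice per non-zero eigenvalue.
import itertools
import math

def square_roots(eigenvalues):
    """Return every square root, each as a list of diagonal entries.

    The same sign is used wherever an eigenvalue is repeated, as the
    definition of a function of an observable requires.
    """
    distinct = sorted(set(a for a in eigenvalues if a != 0))
    roots = []
    for signs in itertools.product((1, -1), repeat=len(distinct)):
        sign_of = dict(zip(distinct, signs))
        roots.append([sign_of.get(a, 1) * math.sqrt(a) for a in eigenvalues])
    return roots

ALPHA = [4.0, 9.0, 0.0]       # two eigenvalues not zero, so 2**2 roots
ROOTS = square_roots(ALPHA)
POSITIVE_ROOT = ROOTS[0]      # the first sign choice is all positive
```

Each entry of `ROOTS`, squared component by component, gives back `ALPHA`, and the eigenvalue zero contributes no sign ambiguity.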


12.  The  general  physical  interpretation 

The assumptions that we made at the beginning of § 10 to get a physical interpretation of the mathematical theory are of a rather special kind, since they can be used only in connexion with eigenstates. We need some more general assumption which will enable us to extract physical information from the mathematics even when we are not dealing with eigenstates.

In classical mechanics an observable always, as we say, ‘has a value’ for any particular state of the system. What is there in quantum mechanics corresponding to this? If we take any observable $\xi$ and any two states $x$ and $y$, corresponding to the vectors $\langle x|$ and $|y\rangle$, then we can form the number $\langle x|\xi|y\rangle$. This number is not very closely analogous to the value which an observable can ‘have’ in the classical theory, for three reasons, namely, (i) it refers to two states of the system, while the classical value always refers to one, (ii) it is in general not a real number, and (iii) it is not uniquely determined by the observable and the states, since the vectors $\langle x|$ and $|y\rangle$ contain arbitrary numerical factors. Even if we impose on $\langle x|$ and $|y\rangle$ the condition that they shall be normalized, there will still be an undetermined factor of modulus unity in $\langle x|\xi|y\rangle$. These three reasons cease to apply, however, if we take the two states to be identical and $|y\rangle$ to be the conjugate imaginary vector to $\langle x|$. The number that we then get, namely $\langle x|\xi|x\rangle$, is necessarily real, and also it is uniquely determined when $\langle x|$ is normalized, since if we multiply $\langle x|$ by the numerical factor $e^{ic}$, $c$ being some real number, we must multiply $|x\rangle$ by $e^{-ic}$ and $\langle x|\xi|x\rangle$ will be unaltered.

One might thus be inclined to make the tentative assumption that the observable $\xi$ ‘has the value’ $\langle x|\xi|x\rangle$ for the state $x$, in a sense analogous to the classical sense. This would not be satisfactory, though, for the following reason. Let us take a second observable $\eta$, which would have by the above assumption the value $\langle x|\eta|x\rangle$ for this same state. We should then expect, from classical analogy, that for this state the sum of the two observables would have a value equal to the sum of the values of the two observables separately and the product of the two observables would have a value equal to the product of the values of the two observables separately. Actually the tentative assumption would give for the sum of the two observables the value $\langle x|\xi+\eta|x\rangle$, which is, in fact, equal to the sum of $\langle x|\xi|x\rangle$ and $\langle x|\eta|x\rangle$, but for the product it would give the value $\langle x|\xi\eta|x\rangle$ or $\langle x|\eta\xi|x\rangle$, neither of which is connected in any simple way with $\langle x|\xi|x\rangle$ and $\langle x|\eta|x\rangle$.

However, since things go wrong only with the product and not with
the sum, it would be reasonable to call $\langle x|\xi|x\rangle$ the average
value of the observable $\xi$ for the state x. This is because the
average of the sum of two quantities must equal the sum of their
averages, but the average of their product need not equal the product
of their averages. We therefore make the general assumption that if the
measurement of the observable $\xi$ for the system in the state
corresponding to $|x\rangle$ is made a large number of times, the average
of all the results obtained will be $\langle x|\xi|x\rangle$, provided
$|x\rangle$ is normalized. If $|x\rangle$ is not normalized, as is
necessarily the case if the state x is an eigenstate of some observable
belonging to an eigenvalue in a range, the assumption becomes that
the average result of a measurement of $\xi$ is proportional to
$\langle x|\xi|x\rangle$. This general assumption provides a basis for a
general physical interpretation of the theory.
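The behaviour of sums and products under averaging can be checked numerically. The following is a minimal sketch, not part of the original text: the two-dimensional system, the particular matrices chosen for $\xi$ and $\eta$, the state, and all function names are assumptions made purely for illustration.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matadd(a, b):
    """Sum of two 2x2 matrices."""
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def average(op, ket):
    """<x|op|x> for a normalized ket; the bra is the conjugate of the ket."""
    op_ket = [sum(op[i][j] * ket[j] for j in range(2)) for i in range(2)]
    return sum(ket[i].conjugate() * op_ket[i] for i in range(2))

# Two illustrative observables (real symmetric matrices); assumed for this sketch.
xi = [[1, 0], [0, -1]]
eta = [[0, 1], [1, 0]]

# A normalized state with real components cos(theta), sin(theta).
theta = math.pi / 8
x = [complex(math.cos(theta)), complex(math.sin(theta))]

avg_xi = average(xi, x)
avg_eta = average(eta, x)
avg_sum = average(matadd(xi, eta), x)    # agrees with avg_xi + avg_eta
avg_prod = average(matmul(xi, eta), x)   # differs from avg_xi * avg_eta
```

For this particular state the average of the sum coincides with the sum of the averages, while the average of the product vanishes although the product of the averages does not.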

The expression that an observable ‘has a particular value’ for a
particular state is permissible in quantum mechanics in the special
case when a measurement of the observable is certain to lead to the
particular value, so that the state is an eigenstate of the observable.
It may easily be verified from the algebra that, with this restricted
meaning for an observable ‘having a value’, if two observables have
values for a particular state, then for this state the sum of the two
observables (if this sum is an observable†) has a value equal to the
sum of the values of the two observables separately and the product
of the two observables (if this product is an observable‡) has a value
equal to the product of the values of the two observables separately.

In the general case we cannot speak of an observable having a value
for a particular state, but we can speak of its having an average value
for the state. We can go further and speak of the probability of its
having any specified value for the state, meaning the probability of
this specified value being obtained when one makes a measurement of
the observable. This probability can be obtained from the general
assumption in the following way.

Let the observable be $\xi$ and let the state correspond to the
normalized ket $|x\rangle$. Then the general assumption tells us, not only
that the average value of $\xi$ is $\langle x|\xi|x\rangle$, but also that the
average value of any function of $\xi$, $f(\xi)$ say, is
$\langle x|f(\xi)|x\rangle$. Take $f(\xi)$ to be that function of $\xi$ which is
equal to unity when $\xi = a$, a being some real number, and zero
otherwise. This function of $\xi$ has a meaning according to our
general theory of functions of an observable, and it may be denoted
by $\delta_{\xi a}$ in conformity with the general notation of the symbol
$\delta$ with two suffixes given on p. 62 (equation (17)). The average
value of this function of $\xi$ is just the probability, $P_a$ say, of
$\xi$ having the value a. Thus

$$P_a = \langle x|\delta_{\xi a}|x\rangle. \tag{45}$$

If a is not an eigenvalue of $\xi$, $\delta_{\xi a}$ multiplied into any
eigenket of $\xi$ is zero, and hence $\delta_{\xi a} = 0$ and $P_a = 0$. This
agrees with a conclusion of § 10, that any result of a measurement of
an observable must be one of its eigenvalues.
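As a worked illustration of (45), consider a hypothetical observable with only two eigenvalues, $+1$ and $-1$. This system, the normalized eigenkets $|{+}\rangle$, $|{-}\rangle$ and the coefficients $\alpha$, $\beta$ are assumptions introduced here and do not appear in the text.

```latex
% Assumed state: |x> = alpha|+> + beta|->,  with |alpha|^2 + |beta|^2 = 1.
% The function equal to unity at xi = 1 and zero at xi = -1 is (1 + xi)/2.
\begin{align*}
\delta_{\xi,1} &= \tfrac12(1+\xi), &
\delta_{\xi,-1} &= \tfrac12(1-\xi), \\
P_{1} &= \langle x|\tfrac12(1+\xi)|x\rangle
       = \tfrac12\bigl(1 + |\alpha|^2 - |\beta|^2\bigr) = |\alpha|^2, &
P_{-1} &= |\beta|^2, \\
\langle x|\xi|x\rangle &= (+1)\,P_{1} + (-1)\,P_{-1}
       = |\alpha|^2 - |\beta|^2 .
\end{align*}
```

For any other real number a the function $\delta_{\xi a}$ vanishes on both eigenkets, so that $P_a = 0$, and the average value is recovered as the sum of the eigenvalues weighted by their probabilities.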

If the possible results of a measurement of $\xi$ form a range of
numbers, the probability of $\xi$ having exactly a particular value will
be zero in most physical problems. The quantity of physical importance
is then the probability of $\xi$ having a value within a small range, say
from a to $a+da$. This probability, which we may call $P(a)\,da$, is
equal to the average value of that function of $\xi$ which is equal to
unity for $\xi$ lying within the range a to $a+da$, and zero otherwise.
This function of $\xi$ has a meaning according to our general theory of
functions of an observable. Denoting it by $\chi(\xi)$, we have

$$P(a)\,da = \langle x|\chi(\xi)|x\rangle. \tag{46}$$

If the range a to $a+da$ does not include any eigenvalues of $\xi$, we
have as above $\chi(\xi) = 0$ and $P(a) = 0$. If $|x\rangle$ is not
normalized, the right-hand sides of (45) and (46) will still be
proportional to the probability of $\xi$ having the value a and lying
within the range a to $a+da$ respectively.

† This is not obviously so, since the sum may not have sufficient
eigenstates to form a complete set, in which case the sum, considered
as a single quantity, would not be measurable.

‡ Here the reality condition may fail, as well as the condition for the
eigenstates to form a complete set.

The assumption of § 10, that a measurement of $\xi$ is certain to give
the result $\xi'$ if the system is in an eigenstate of $\xi$ belonging to
the eigenvalue $\xi'$, is consistent with the general assumption for
physical interpretation and can in fact be deduced from it. Working
from the general assumption we see that, if $|\xi'\rangle$ is an eigenket
of $\xi$ belonging to the eigenvalue $\xi'$ then, in the case of discrete
eigenvalues of $\xi$,

$$\delta_{\xi a}|\xi'\rangle = 0 \quad\text{unless } a = \xi',$$

and in the case of a range of eigenvalues of $\xi$

$$\chi(\xi)|\xi'\rangle = 0 \quad\text{unless the range } a \text{ to } a+da \text{ includes } \xi'.$$

In either case, for the state corresponding to $|\xi'\rangle$, the
probability of $\xi$ having any value other than $\xi'$ is zero.

An eigenstate of $\xi$ belonging to an eigenvalue $\xi'$ lying in a range
is a state which cannot strictly be realized in practice, since it would
need an infinite amount of precision to get $\xi$ to equal exactly $\xi'$.
The most that could be attained in practice would be to get $\xi$ to lie
within a narrow range about the value $\xi'$. The system would then
be in a state approximating to an eigenstate of $\xi$. Thus an eigenstate
belonging to an eigenvalue in a range is a mathematical idealization
of what can be attained in practice. All the same such eigenstates
play a very useful role in the theory and one could not very well do
without them. Science contains many examples of theoretical concepts
which are limits of things met with in practice and are useful
for the precise formulation of laws of nature, although they are not
realizable experimentally, and this is just one more of them. It may
be that the infinite length of the ket vectors corresponding to these
eigenstates is connected with their unrealizability, and that all
realizable states correspond to ket vectors that can be normalized and
that form a Hilbert space.

A state may be simultaneously an eigenstate of two observables. If
the state corresponds to the ket vector $|A\rangle$ and the observables are
$\xi$ and $\eta$, we should then have the equations

$$\xi|A\rangle = \xi'|A\rangle,$$

$$\eta|A\rangle = \eta'|A\rangle,$$

where $\xi'$ and $\eta'$ are eigenvalues of $\xi$ and $\eta$ respectively. We
can now deduce

$$\xi\eta|A\rangle = \xi\eta'|A\rangle = \xi'\eta'|A\rangle = \xi'\eta|A\rangle = \eta\xi'|A\rangle = \eta\xi|A\rangle,$$

or

$$(\xi\eta-\eta\xi)|A\rangle = 0.$$

This suggests that the chances for the existence of a simultaneous
eigenstate are most favourable if $\xi\eta-\eta\xi = 0$ and the two
observables commute. If they do not commute a simultaneous eigenstate
is not impossible, but is rather exceptional. On the other hand, if
they do commute there exist so many simultaneous eigenstates that they
form a complete set, as will now be proved.
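The connexion between commutation and simultaneous eigenkets can be tried out on small matrices. The sketch below is illustrative only: the 2x2 matrices standing for $\xi$, $\eta$ and a third observable `zeta`, and the helper names, are assumptions and not taken from the text.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(a, b):
    """The matrix ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]

def apply(op, ket):
    """Result of the linear operator op acting on a ket (list of components)."""
    return [sum(op[i][j] * ket[j] for j in range(len(ket)))
            for i in range(len(ket))]

# Two commuting observables (assumed example): eta = 2 + xi.
xi = [[0, 1], [1, 0]]
eta = [[2, 1], [1, 2]]
comm_commuting = commutator(xi, eta)      # the zero matrix

# The ket A is a simultaneous eigenket: xi A = 1 * A and eta A = 3 * A.
A = [1, 1]
xi_A = apply(xi, A)
eta_A = apply(eta, A)

# A pair that does not commute (assumed example).
zeta = [[1, 0], [0, -1]]
comm_noncommuting = commutator(zeta, xi)
# A simultaneous eigenket would have to be annihilated by the commutator.
# Its determinant is non-zero, so no such ket exists for this pair.
det = (comm_noncommuting[0][0] * comm_noncommuting[1][1]
       - comm_noncommuting[0][1] * comm_noncommuting[1][0])
```

In the commuting case the second simultaneous eigenket is the one with components 1 and -1, and the two together form a complete set for this two-dimensional example.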

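Let $\xi$ and $\eta$ be two commuting observables. Take an eigenket of
$\eta$, $|\eta'\rangle$ say, belonging to the eigenvalue $\eta'$, and expand it
in terms of eigenkets of $\xi$ as an integral plus a sum, thus

$$|\eta'\rangle = \int |\xi'\eta'c\rangle\,d\xi' + \sum_r |\xi^r\eta'd\rangle. \tag{47}$$

The eigenkets of $\xi$ on the right-hand side here have $\eta'$ inserted
in them as an extra label, in order to remind us that they come from
the expansion of a special ket vector, namely $|\eta'\rangle$, and not a
general one. We can now show that each of these eigenkets of $\xi$ is
also an eigenket of $\eta$ belonging to the eigenvalue $\eta'$. We have

$$0 = (\eta-\eta')|\eta'\rangle = \int (\eta-\eta')|\xi'\eta'c\rangle\,d\xi' + \sum_r (\eta-\eta')|\xi^r\eta'd\rangle. \tag{48}$$

Now the ket $(\eta-\eta')|\xi^r\eta'd\rangle$ satisfies

$$\xi(\eta-\eta')|\xi^r\eta'd\rangle = (\eta-\eta')\xi|\xi^r\eta'd\rangle = (\eta-\eta')\xi^r|\xi^r\eta'd\rangle = \xi^r(\eta-\eta')|\xi^r\eta'd\rangle,$$

showing that it is an eigenket of $\xi$ belonging to the eigenvalue
$\xi^r$, and similarly the ket $(\eta-\eta')|\xi'\eta'c\rangle$ is an eigenket
of $\xi$ belonging to the eigenvalue $\xi'$. Equation (48) thus gives an
integral plus a sum of eigenkets of $\xi$ equal to zero, which, as we
have seen with equation (31), is impossible unless the integrand and
every term in the sum vanishes. Hence

$$(\eta-\eta')|\xi'\eta'c\rangle = 0, \qquad (\eta-\eta')|\xi^r\eta'd\rangle = 0,$$

so that all the kets appearing on the right-hand side of (47) are
eigenkets of $\eta$ as well as of $\xi$. Equation (47) now gives
$|\eta'\rangle$ expanded in terms of simultaneous eigenkets of $\xi$ and
$\eta$. Since any ket can be expanded in terms of eigenkets $|\eta'\rangle$
of $\eta$, it follows that any ket can be expanded in terms of
simultaneous eigenkets of $\xi$ and $\eta$, and thus the simultaneous
eigenstates form a complete set.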

The above simultaneous eigenkets of $\xi$ and $\eta$, $|\xi'\eta'c\rangle$ and
$|\xi^r\eta'd\rangle$, are labelled by the eigenvalues $\xi'$ and $\eta'$, or
$\xi^r$ and $\eta'$, to which they belong, together with the labels c and
d which may also be necessary. The procedure of using eigenvalues as
labels for simultaneous eigenvectors will be generally followed in the
future, just as it has been followed in the past for eigenvectors of
single observables.

The converse to the above theorem says that, if $\xi$ and $\eta$ are two
observables such that their simultaneous eigenstates form a complete
set, then $\xi$ and $\eta$ commute. To prove this, we note that, if
$|\xi'\eta'\rangle$ is a simultaneous eigenket belonging to the eigenvalues
$\xi'$ and $\eta'$,

$$(\xi\eta-\eta\xi)|\xi'\eta'\rangle = (\xi'\eta'-\eta'\xi')|\xi'\eta'\rangle = 0. \tag{49}$$

Since the simultaneous eigenstates form a complete set, an arbitrary
ket $|P\rangle$ can be expanded in terms of simultaneous eigenkets
$|\xi'\eta'\rangle$, for each of which (49) holds, and hence

$$(\xi\eta-\eta\xi)|P\rangle = 0$$

and so $\xi\eta-\eta\xi = 0$.

The idea of simultaneous eigenstates may be extended to more
than two observables and the above theorem and its converse still
hold, i.e. if any set of observables commute, each with all the others,
their simultaneous eigenstates form a complete set, and conversely.
The same arguments used for the proof with two observables are
adequate for the general case; e.g., if we have three commuting
observables $\xi$, $\eta$, $\zeta$, we can expand any simultaneous eigenket
of $\xi$ and $\eta$ in terms of eigenkets of $\zeta$ and then show that each
of these eigenkets of $\zeta$ is also an eigenket of $\xi$ and of $\eta$. Thus
the simultaneous eigenket of $\xi$ and $\eta$ is expanded in terms of
simultaneous eigenkets of $\xi$, $\eta$, and $\zeta$, and since any ket can
be expanded in terms of simultaneous eigenkets of $\xi$ and $\eta$, it can
also be expanded in terms of simultaneous eigenkets of $\xi$, $\eta$, and
$\zeta$.

If $\xi, \eta, \zeta, \ldots$ are commuting observables, we define a general
function f of them to be that linear operator $f(\xi,\eta,\zeta,\ldots)$ which
satisfies

$$f(\xi,\eta,\zeta,\ldots)|\xi'\eta'\zeta'\ldots\rangle = f(\xi',\eta',\zeta',\ldots)|\xi'\eta'\zeta'\ldots\rangle, \tag{50}$$

where $|\xi'\eta'\zeta'\ldots\rangle$ is any simultaneous eigenket of
$\xi, \eta, \zeta, \ldots$ belonging to the eigenvalues $\xi', \eta', \zeta', \ldots$.
As with a function of a single observable, we can show that
$f(\xi,\eta,\zeta,\ldots)$ is completely determined by (50), that

$$\overline{f(\xi,\eta,\zeta,\ldots)} = \bar{f}(\xi,\eta,\zeta,\ldots),$$

corresponding to (37), and that if $f(a,b,c,\ldots)$ is a real function,
$f(\xi,\eta,\zeta,\ldots)$ is real and is an observable.

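We can now proceed to generalize the results (45) and (46). Given
a set of commuting observables $\xi, \eta, \zeta, \ldots$, we may form that
function of them which is equal to unity when $\xi = a$, $\eta = b$,
$\zeta = c, \ldots$, a, b, c,... being real numbers, and is equal to zero
when any of these conditions is not fulfilled. This function may be
written $\delta_{\xi a}\delta_{\eta b}\delta_{\zeta c}\ldots$, and is in fact just the
product in any order of the factors $\delta_{\xi a}$, $\delta_{\eta b}$,
$\delta_{\zeta c}, \ldots$ defined as functions of single observables, as may be
seen by substituting this product for $f(\xi,\eta,\zeta,\ldots)$ in the
left-hand side of (50). The average value of this function for any
state is the probability, $P_{abc\ldots}$ say, of $\xi, \eta, \zeta, \ldots$ having
the values a, b, c,... respectively for that state. Thus if the state
corresponds to the normalized ket vector $|x\rangle$, we get from our
general assumption for physical interpretation

$$P_{abc\ldots} = \langle x|\delta_{\xi a}\delta_{\eta b}\delta_{\zeta c}\ldots|x\rangle. \tag{51}$$

$P_{abc\ldots}$ is zero unless each of the numbers a, b, c,... is an
eigenvalue of the corresponding observable. If any of the numbers
a, b, c,... is an eigenvalue in a range of eigenvalues of the
corresponding observable, $P_{abc\ldots}$ will usually again be zero, but in
this case we ought to replace the requirement that this observable
shall have exactly one value by the requirement that it shall have a
value lying within a small range, which involves replacing one of the
$\delta$ factors in (51) by a factor like the $\chi(\xi)$ of equation (46). On
carrying out such a replacement for each of the observables
$\xi, \eta, \zeta, \ldots$, whose corresponding numerical value a, b, c,... lies
in a range of eigenvalues, we shall get a probability which does not in
general vanish.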

If  certain  observables  commute,  there  exist  states  for  which  they  all 
have  particular  values,  in  the  sense  explained  at  the  bottom  of  p.  46, 
namely  the  simultaneous  eigenstates.  Thus  one  can  give  a meaning  to 
several commuting observables having values at the same time. Further, we
see from (51) that for any state one can give a meaning to the probability
of particular results being obtained for simultaneous measurements of
several  commuting  observables.  This  conclusion  is  an  important  new 
development.  In  general  one  cannot  make  an  observation  on  a 
system  in  a definite  state  without  disturbing  that  state  and  spoiling 
it  for  the  purposes  of  a second  observation.  One  cannot  then  give 
any  meaning  to  the  two  observations  being  made  simultaneously. 
The  above  conclusion  tells  us,  though,  that  in  the  special  case  when 
the  two  observables  commute,  the  observations  are  to  be  considered 
as  non-interfering  or  compatible,  in  such  a way  that  one  can  give  a 
meaning  to  the  two  observations  being  made  simultaneously  and  can 
discuss  the  probability  of  any  particular  results  being  obtained.  The 
two observations may, in fact, be considered as a single observation
of  a more  complicated  type,  the  result  of  which  is  expressible  by  two 
numbers  instead  of  a single  number.  From  the  point  of  view  of  general 
theory,  any  two  or  more  commuting  observables  may  be  counted  as  a 
single  observable,  the  result  of  a measurement  of  which  consists  of  two  or 
more  numbers.  The  states  for  which  this  measurement  is  certain  to 
lead  to  one  particular  result  are  the  simultaneous  eigenstates. 
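The joint probability (51) takes a particularly simple form when the state is expanded in simultaneous eigenkets, since each product of $\delta$ symbols then merely selects the components carrying the required labels. The sketch below assumes a hypothetical pair of commuting observables, each with eigenvalues +1 and -1, and an arbitrarily chosen normalized state; none of these values come from the text.

```python
# Simultaneous eigenbasis of two commuting observables xi and eta.
# Each basic ket is labelled by its pair of eigenvalues (xi', eta').
labels = [(1, 1), (1, -1), (-1, 1), (-1, -1)]

# Components of a normalized ket |x> in that basis (illustrative values;
# the squared moduli 0.4, 0.3, 0.2, 0.1 add up to unity).
amplitudes = [0.4 ** 0.5, 1j * 0.3 ** 0.5, -(0.2 ** 0.5), 0.1 ** 0.5]

def joint_probability(a, b):
    """P_ab = <x| delta_{xi a} delta_{eta b} |x>.

    In the simultaneous eigenbasis the product of the two delta symbols
    keeps exactly those components whose labels equal (a, b).
    """
    return sum(abs(c) ** 2
               for (xi_val, eta_val), c in zip(labels, amplitudes)
               if xi_val == a and eta_val == b)

def probability_xi(a):
    """Probability of xi alone having the value a, summed over eta."""
    return sum(joint_probability(a, b) for b in (1, -1))

p_plus_minus = joint_probability(1, -1)      # component labelled (1, -1)
p_xi_plus = probability_xi(1)                # both values of eta included
p_not_eigenvalue = joint_probability(2, 1)   # 2 is not an eigenvalue of xi
total = sum(joint_probability(a, b) for a, b in labels)
```

The result of the combined measurement is here a pair of numbers, in keeping with the view that commuting observables may be counted as a single observable.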


III

REPRESENTATIONS 

14.  Basic  vectors 

In  the  preceding  chapters  we  set  up  an  algebraic  scheme  involving 
certain  abstract  quantities  of  three  kinds,  namely  bra  vectors,  ket 
vectors,  and  linear  operators,  and  we  expressed  some  of  the  funda- 
mental laws  of  quantum  mechanics  in  terms  of  them.  It  would  be 
possible  to  continue  to  develop  the  theory  in  terms  of  these  abstract 
quantities  and  to  use  them  for  applications  to  particular  problems. 
However,  for  some  purposes  it  is  more  convenient  to  replace  the 
abstract  quantities  by  sets  of  numbers  with  analogous  mathematical 
properties  and  to  work  in  terms  of  these  sets  of  numbers.  The  proce- 
dure is  similar  to  using  coordinates  in  geometry,  and  has  the  advan- 
tage of  giving  one  greater  mathematical  power  for  the  solving  of 
particular  problems. 

The  way  in  which  the  abstract  quantities  are  to  be  replaced  by 
numbers  is  not  unique,  there  being  many  possible  ways  corresponding 
to  the  many  systems  of  coordinates  one  can  have  in  geometry.  Each 
of  these  ways  is  called  a representation  and  the  set  of  numbers  that 
replace  an  abstract  quantity  is  called  the  representative  of  that 
abstract quantity in the representation. Thus the representative of
an  abstract  quantity  corresponds  to  the  coordinates  of  a geometrical 
object.  When  one  has  a particular  problem  to  work  out  in  quantum 
mechanics,  one  can  minimize  the  labour  by  using  a representation 
in  which  the  representatives  of  the  more  important  abstract  quanti- 
ties occurring  in  that  problem  are  as  simple  as  possible. 

To  set  up  a representation  in  a general  way,  we  take  a complete 
set  of  bra  vectors,  i.e.  a set  such  that  any  bra  can  be  expressed 
linearly  in  terms  of  them  (as  a sum  or  an  integral  or  possibly  an 
integral  plus  a sum).  These  bras  we  call  the  basic  bras  of  the  repre- 
sentation. They  are  sufficient,  as  we  shall  see,  to  fix  the  representation 
completely. 

Take any ket $|a\rangle$ and form its scalar product with each of the
basic bras. The numbers so obtained constitute the representative of
$|a\rangle$. They are sufficient to determine the ket $|a\rangle$ completely,
since if there is a second ket, $|a_1\rangle$ say, for which these numbers
are the same, the difference $|a\rangle - |a_1\rangle$ will have its scalar
product with any basic bra vanishing, and hence its scalar product with
any bra whatever will vanish and $|a\rangle - |a_1\rangle$ itself will vanish.

We may suppose the basic bras to be labelled by one or more
parameters, $\lambda_1, \lambda_2, \ldots, \lambda_u$, each of which may take on
certain numerical values. The basic bras will then be written
$\langle\lambda_1\lambda_2\ldots\lambda_u|$ and the representative of $|a\rangle$ will be
written $\langle\lambda_1\lambda_2\ldots\lambda_u|a\rangle$. This representative will
now consist of a set of numbers, one for each set of values that
$\lambda_1, \lambda_2, \ldots, \lambda_u$ may have in their respective domains. Such a
set of numbers just forms a function of the variables
$\lambda_1, \lambda_2, \ldots, \lambda_u$. Thus the representative of a ket may be
looked upon either as a set of numbers or as a function of the
variables used to label the basic bras.

If the number of independent states of our dynamical system is
finite, equal to n say, it is sufficient to take n basic bras, which may
be labelled by a single parameter $\lambda$ taking on the values
1, 2, 3,..., n. The representative of any ket $|a\rangle$ now consists of
the set of n numbers $\langle 1|a\rangle, \langle 2|a\rangle, \langle 3|a\rangle, \ldots,
\langle n|a\rangle$, which are precisely the coordinates of the vector
$|a\rangle$ referred to a system of coordinates in the usual way. The idea
of the representative of a ket vector is just a generalization of the
idea of the coordinates of an ordinary vector and reduces to the latter
when the number of dimensions of the space of the ket vectors is finite.
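For a finite number of dimensions the representative can be computed directly. The following sketch assumes a two-dimensional space, and for simplicity takes the basic bras to be the conjugate imaginaries of two orthonormal kets; the particular vectors and names are illustrative choices and not part of the text.

```python
def scalar_product(bra_source, ket):
    """<b|a>, the bra <b| being the conjugate imaginary of the ket b."""
    return sum(b.conjugate() * a for b, a in zip(bra_source, ket))

s = 2 ** -0.5
# Two orthonormal kets; the basic bras <1|, <2| are their conjugates.
basic_kets = [[s, s], [s, -s]]

# An arbitrary ket |a>, given by its components in some fixed frame.
a = [3 + 0j, 1j]

# The representative of |a>: the numbers <1|a>, <2|a>.
representative = [scalar_product(b, a) for b in basic_kets]

# These numbers determine |a> completely. With orthonormal basic
# vectors the ket is rebuilt as the sum of <k|a> times the k-th basic ket.
rebuilt = [sum(representative[k] * basic_kets[k][i] for k in range(2))
           for i in range(2)]
```

The list `representative` plays the part of the coordinates of an ordinary vector, and `rebuilt` agrees with the original components of the ket to within rounding.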

In  a general  representation  there  is  no  need  for  the  basic  bras  to 
be  all  independent.  In  most  representations  used  in  practice,  how- 
ever, they  are  all  independent,  and  also  satisfy  the  more  stringent 
condition  that  any  two  of  them  are  orthogonal.  The  representation 
is  then  called  an  orthogonal  representation. 

Take an orthogonal representation with basic bras
$\langle\lambda_1\lambda_2\ldots\lambda_u|$, labelled by parameters
$\lambda_1, \lambda_2, \ldots, \lambda_u$ whose domains are all real. Take a ket
$|a\rangle$ and form its representative $\langle\lambda_1\lambda_2\ldots\lambda_u|a\rangle$.
Now form the numbers $\lambda_1\langle\lambda_1\lambda_2\ldots\lambda_u|a\rangle$ and
consider them as the representative of a new ket $|b\rangle$. This is
permissible since the numbers forming the representative of a ket are
independent, on account of the basic bras being independent. The ket
$|b\rangle$ is defined by the equation

$$\langle\lambda_1\lambda_2\ldots\lambda_u|b\rangle = \lambda_1\langle\lambda_1\lambda_2\ldots\lambda_u|a\rangle.$$

The ket $|b\rangle$ is evidently a linear function of the ket $|a\rangle$, so it
may be considered as the result of a linear operator applied to
$|a\rangle$. Calling this linear operator $L_1$, we have

$$|b\rangle = L_1|a\rangle.$$

Our definition of $|b\rangle$ now gives

$$\langle\lambda_1\lambda_2\ldots\lambda_u|L_1|a\rangle = \lambda_1\langle\lambda_1\lambda_2\ldots\lambda_u|a\rangle.$$

This equation holds for any ket $|a\rangle$, so we get

$$\langle\lambda_1\lambda_2\ldots\lambda_u|L_1 = \lambda_1\langle\lambda_1\lambda_2\ldots\lambda_u|. \tag{1}$$

there is only one independent simultaneous eigenbra of
$\xi_1, \xi_2, \ldots, \xi_u$ belonging to any set of eigenvalues
$\xi_1', \xi_2', \ldots, \xi_u'$. Then we may take these simultaneous eigenbras,
with arbitrary numerical coefficients, as our basic bras. They are all
orthogonal on account of the orthogonality theorem (any two of them
will have at least one eigenvalue different, which is sufficient to
make them orthogonal) and there are sufficient of them to form a
complete set, from a result of § 13. They may conveniently be labelled
by the eigenvalues $\xi_1', \xi_2', \ldots, \xi_u'$ to which they belong, so that
one of them is written $\langle\xi_1'\xi_2'\ldots\xi_u'|$.
Passing now to the general case when there are several independent
simultaneous eigenbras of $\xi_1, \xi_2, \ldots, \xi_u$ belonging to some sets of
eigenvalues, we must pick out from all the simultaneous eigenbras
belonging to a set of eigenvalues $\xi_1', \xi_2', \ldots, \xi_u'$ a complete
subset, the members of which are all orthogonal to one another. (The
condition of completeness here means that any simultaneous eigenbra
belonging to the eigenvalues $\xi_1', \xi_2', \ldots, \xi_u'$ can be expressed
linearly in terms of the members of the subset.) We must do this for
each set of eigenvalues $\xi_1', \xi_2', \ldots, \xi_u'$ and then put all the
members of all the subsets together and take them as the basic bras of
the representation. These bras are all orthogonal, two of them being
orthogonal from the orthogonality theorem if they belong to different
sets of eigenvalues and from the special way in which they were chosen
if they belong to the same set of eigenvalues, and they form altogether
a complete set of bras, as any bra can be expressed linearly in terms
of simultaneous eigenbras and each simultaneous eigenbra can then be
expressed linearly in terms of the members of a subset. There are
infinitely many ways of choosing the subsets, and each way provides
one orthogonal representation.

For labelling the basic bras in this general case, we may use the
eigenvalues $\xi_1', \xi_2', \ldots, \xi_u'$ to which they belong, together with
certain additional real variables $\lambda_1, \lambda_2, \ldots, \lambda_v$ say, which
must be introduced to distinguish basic vectors belonging to the same
set of eigenvalues from one another. A basic bra is then written
$\langle\xi_1'\xi_2'\ldots\xi_u'\lambda_1\lambda_2\ldots\lambda_v|$. Corresponding to the
variables $\lambda_1, \lambda_2, \ldots, \lambda_v$ we can define linear operators
$L_1, L_2, \ldots, L_v$ by equations like (1) and can show that these linear
operators have the basic bras as eigenbras, and that they are real and
observables, and that they commute with one another and with the
$\xi$'s. The basic bras are now simultaneous eigenbras of all the
commuting observables $\xi_1, \xi_2, \ldots, \xi_u, L_1, L_2, \ldots, L_v$.

the basic vectors, rather than leave their lengths arbitrary, and so
introduce a further stage of simplification into the representation.
However, it is possible to normalize them only if the parameters
which label them all take on discrete values. If any of these
parameters are continuous variables that can take on all values in a
range, the basic vectors are eigenvectors of some observable belonging
to eigenvalues in a range and are of infinite length, from the
discussion in § 10 (see p. 39 and top of p. 40). Some other procedure is
then needed to fix the numerical factors by which the basic vectors may
be multiplied. To get a convenient method of handling this question
a new mathematical notation is required, which will be given in the
next section.

15. The $\delta$ function

Our work in § 10 led us to consider quantities involving a certain
kind of infinity. To get a precise notation for dealing with these
infinities, we introduce a quantity $\delta(x)$ depending on a parameter x
satisfying the conditions

$$\int_{-\infty}^{\infty} \delta(x)\,dx = 1, \qquad \delta(x) = 0 \ \text{ for } x \neq 0. \tag{2}$$

To get a picture of $\delta(x)$, take a function of the real variable x
which vanishes everywhere except inside a small domain, of length
$\epsilon$ say, surrounding the origin $x = 0$, and which is so large inside
this domain that its integral over this domain is unity. The exact
shape of the function inside this domain does not matter, provided
there are no unnecessarily wild variations (for example provided the
function is always of order $\epsilon^{-1}$). Then in the limit
$\epsilon \to 0$ this function will go over into $\delta(x)$.

$\delta(x)$ is not a function of x according to the usual mathematical
definition of a function, which requires a function to have a definite
value for each point in its domain, but is something more general,
which we may call an ‘improper function’ to show up its difference
from a function defined by the usual definition. Thus $\delta(x)$ is not a
quantity which can be generally used in mathematical analysis like
an ordinary function, but its use must be confined to certain simple
types of expression for which it is obvious that no inconsistency
can arise.

The most important property of $\delta(x)$ is exemplified by the
following equation,

$$\int_{-\infty}^{\infty} f(x)\,\delta(x)\,dx = f(0), \tag{3}$$

where $f(x)$ is any continuous function of x. We can easily see the
validity of this equation from the above picture of $\delta(x)$. The
left-hand side of (3) can depend only on the values of $f(x)$ very close
to the origin, so that we may replace $f(x)$ by its value at the origin,
$f(0)$, without essential error. Equation (3) then follows from the
first of equations (2). By making a change of origin in (3), we can
deduce the formula

$$\int_{-\infty}^{\infty} f(x)\,\delta(x-a)\,dx = f(a), \tag{4}$$

where a is any real number. Thus the process of multiplying a function
of x by $\delta(x-a)$ and integrating over all x is equivalent to the
process of substituting a for x. This general result holds also if the
function of x is not a numerical one, but is a vector or linear
operator depending on x.

The range of integration in (3) and (4) need not be from $-\infty$ to
$\infty$, but may be over any domain surrounding the critical point at
which the $\delta$ function does not vanish. In future the limits of
integration will usually be omitted in such equations, it being
understood that the domain of integration is a suitable one.
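The substitution property (4) may be illustrated numerically with the picture of $\delta(x)$ given above, using a rectangular pulse of small but finite width. This is only a sketch under stated assumptions: the pulse shape, the width `eps`, the test function `f`, the point `a` and the midpoint integration rule are all choices made here for illustration.

```python
def delta_eps(x, eps):
    """Rectangular pulse of width eps and height 1/eps centred on the origin."""
    return 1.0 / eps if abs(x) < eps / 2 else 0.0

def integrate(g, lo, hi, steps=200000):
    """Midpoint-rule approximation to the integral of g from lo to hi."""
    h = (hi - lo) / steps
    return h * sum(g(lo + (i + 0.5) * h) for i in range(steps))

def f(x):
    """An arbitrary continuous test function."""
    return x * x + 1.0

a = 0.5       # point at which the pulse is centred
eps = 1e-2    # width of the pulse; the delta function is the limit eps -> 0

# The pulse integrates to unity, as in the first of equations (2).
norm = integrate(lambda x: delta_eps(x - a, eps), -1.0, 2.0)

# Multiplying by the pulse and integrating nearly substitutes a for x, as in (4).
approx = integrate(lambda x: f(x) * delta_eps(x - a, eps), -1.0, 2.0)
```

The domain of integration here is not infinite but merely surrounds the point a, in accordance with the remark above, and `approx` approaches `f(a)` as `eps` is reduced.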

Equations (3) and (4) show that, although an improper function
does not itself have a well-defined value, when it occurs as a factor
in an integrand the integral has a well-defined value. In quantum
theory, whenever an improper function appears, it will be something
which is to be used ultimately in an integrand. Therefore it should be
possible to rewrite the theory in a form in which the improper
functions appear all through only in integrands. One could then
eliminate the improper functions altogether. The use of improper
functions thus does not involve any lack of rigour in the theory, but
is merely a convenient notation, enabling us to express in a concise
form certain relations which we could, if necessary, rewrite in a form
not involving improper functions, but only in a cumbersome way which
would tend to obscure the argument.

An alternative way of defining the $\delta$ function is as the
differential coefficient $\epsilon'(x)$ of the function $\epsilon(x)$ given by

$$\epsilon(x) = \begin{cases} 0 & (x < 0) \\ 1 & (x > 0). \end{cases} \tag{5}$$

We may verify that this is equivalent to the previous definition by
substituting $\epsilon'(x)$ for $\delta(x)$ in the left-hand side of (3) and
integrating by parts. We find, for $g_1$ and $g_2$ two positive numbers,

$$\int_{-g_2}^{g_1} f(x)\,\epsilon'(x)\,dx = \bigl[f(x)\epsilon(x)\bigr]_{-g_2}^{g_1} - \int_{-g_2}^{g_1} f'(x)\,\epsilon(x)\,dx = f(g_1) - \int_{0}^{g_1} f'(x)\,dx = f(0),$$

in agreement with (3). The $\delta$ function appears whenever one
differentiates a discontinuous function.

There are a number of elementary equations which one can write
down about $\delta$ functions. These equations are essentially rules of
manipulation for algebraic work involving $\delta$ functions. The meaning
of any of these equations is that its two sides give equivalent results
as factors in an integrand.

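Examples of such equations are

$$\delta(-x) = \delta(x) \tag{6}$$

$$x\,\delta(x) = 0, \tag{7}$$

$$\delta(ax) = a^{-1}\delta(x) \quad (a > 0), \tag{8}$$

$$\delta(x^2-a^2) = \tfrac12 a^{-1}\{\delta(x-a)+\delta(x+a)\} \quad (a > 0), \tag{9}$$

$$\int \delta(a-x)\,dx\,\delta(x-b) = \delta(a-b), \tag{10}$$

$$f(x)\,\delta(x-a) = f(a)\,\delta(x-a). \tag{11}$$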

Equation (6), which merely states that $\delta(x)$ is an even function of
its variable x, is trivial. To verify (7) take any continuous function
of x, $f(x)$. Then

$$\int f(x)\,x\,\delta(x)\,dx = 0,$$

from (3). Thus $x\,\delta(x)$ as a factor in an integrand is equivalent to
zero, which is just the meaning of (7). (8) and (9) may be verified
by similar elementary arguments. To verify (10) take any continuous
function of a, $f(a)$. Then

$$\int f(a)\,da \int \delta(a-x)\,dx\,\delta(x-b) = \int \delta(x-b)\,dx \int f(a)\,da\,\delta(a-x) = \int \delta(x-b)\,dx\,f(x) = \int f(a)\,da\,\delta(a-b).$$

Thus the two sides of (10) are equivalent as factors in an integrand
with a as variable of integration. It may be shown in the same way

16. Properties of the basic vectors

Using the notation of the $\delta$ function, we can proceed with the
theory of representations. Let us suppose first that we have a single
observable $\xi$ forming by itself a complete commuting set, the
condition for this being that there is only one eigenstate of $\xi$
belonging to any eigenvalue $\xi'$, and let us set up an orthogonal
representation in which the basic vectors are eigenvectors of $\xi$ and
are written $\langle\xi'|$, $|\xi'\rangle$.

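In the case when the eigenvalues of $\xi$ are discrete, we can normalize
the basic vectors, and we then have

$$\langle\xi'|\xi''\rangle = 0 \quad (\xi' \neq \xi''),$$

$$\langle\xi'|\xi'\rangle = 1.$$

These equations can be combined into the single equation

$$\langle\xi'|\xi''\rangle = \delta_{\xi'\xi''}, \tag{16}$$

where the symbol $\delta$ with two suffixes, which we shall often use in the
future, has the meaning

$$\delta_{rs} = \begin{cases} 0 & \text{when } r \neq s \\ 1 & \text{when } r = s. \end{cases} \tag{17}$$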

In the case when the eigenvalues of $\xi$ are continuous we cannot
normalize the basic vectors. If we now consider the quantity
$\langle\xi'|\xi''\rangle$ with $\xi'$ fixed and $\xi''$ varying, we see from the
work connected with expression (29) of § 10 that this quantity vanishes
for $\xi'' \neq \xi'$ and that its integral over a range of $\xi''$ extending
through the value $\xi'$ is finite, equal to c say. Thus

$$\langle\xi'|\xi''\rangle = c\,\delta(\xi'-\xi'').$$

From (30) of § 10, c is a positive number. It may vary with $\xi'$, so
we should write it $c(\xi')$ or $c'$ for brevity, and thus we have

$$\langle\xi'|\xi''\rangle = c'\,\delta(\xi'-\xi''). \tag{18}$$

Alternatively, we have

$$\langle\xi'|\xi''\rangle = c''\,\delta(\xi'-\xi''), \tag{19}$$

where $c''$ is short for $c(\xi'')$, the right-hand sides of (18) and (19)
being equal on account of (11).

Let us pass to another representation whose basic vectors are eigenvectors of $\xi$, the new basic vectors being numerical multiples of the previous ones. Calling the new basic vectors $\langle\xi'^*|$, $|\xi'^*\rangle$, with the additional label $*$ to distinguish them from the previous ones, we have

$$\langle\xi'^*| = \bar{k}'\,\langle\xi'|, \qquad |\xi'^*\rangle = k'\,|\xi'\rangle,$$

where $k'$ is short for $k(\xi')$ and is a number depending on $\xi'$. We get

$$\langle\xi'^*|\xi''^*\rangle = \bar{k}'k''\,\langle\xi'|\xi''\rangle = \bar{k}'k''c'\,\delta(\xi'-\xi'')$$

with the help of (18). This may be written

$$\langle\xi'^*|\xi''^*\rangle = \bar{k}'k'c'\,\delta(\xi'-\xi'')$$

from (11). By choosing $k'$ so that its modulus is $c'^{-1/2}$, which is possible since $c'$ is positive, we arrange to have

$$\langle\xi'^*|\xi''^*\rangle = \delta(\xi'-\xi''). \tag{20}$$

The lengths of the new basic vectors are now fixed so as to make the representation as simple as possible. The way these lengths were fixed is in some respects analogous to the normalizing of the basic vectors in the case of discrete $\xi'$, equation (20) being of the form of (16) with the $\delta$ function $\delta(\xi'-\xi'')$ replacing the $\delta$ symbol $\delta_{\xi'\xi''}$ of equation (16). We shall continue to work with the new representation and shall drop the $*$ labels in it to save writing. Thus (20) will now be written

$$\langle\xi'|\xi''\rangle = \delta(\xi'-\xi''). \tag{21}$$

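We can develop the theory on closely parallel lines for the discrete and continuous cases. For the discrete case we have, using (16),

$$\sum_{\xi'} |\xi'\rangle\langle\xi'|\xi''\rangle = \sum_{\xi'} |\xi'\rangle\,\delta_{\xi'\xi''} = |\xi''\rangle,$$

the sum being taken over all eigenvalues. This equation holds for any basic ket $|\xi''\rangle$ and hence, since the basic kets form a complete set,

$$\sum_{\xi'} |\xi'\rangle\langle\xi'| = 1. \tag{22}$$

This is a useful equation expressing an important property of the basic vectors: if $|\xi'\rangle$ is multiplied on the right by $\langle\xi'|$, the resulting linear operator, summed for all $\xi'$, equals the unit operator. Equations (16) and (22) give the fundamental properties of the basic vectors for the discrete case.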

Similarly, for the continuous case we have, using (21),

$$\int |\xi'\rangle\,d\xi'\,\langle\xi'|\xi''\rangle = \int |\xi'\rangle\,d\xi'\,\delta(\xi'-\xi'') = |\xi''\rangle, \tag{23}$$

from (4) applied with a ket vector for $f(x)$, the range of integration being the range of eigenvalues. This holds for any basic ket $|\xi''\rangle$ and hence

$$\int |\xi'\rangle\,d\xi'\,\langle\xi'| = 1. \tag{24}$$

This is of the same form as (22) with an integral replacing the sum. Equations (21) and (24) give the fundamental properties of the basic vectors for the continuous case.
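The discrete relations (16) and (22) can be checked numerically in a finite-dimensional model. The following is only a sketch under stated assumptions: the 3×3 Hermitian matrix `xi` is a hypothetical stand-in for the observable $\xi$, and NumPy's `eigh` is used to supply normalized eigenkets.

```python
import numpy as np

# Hypothetical 3x3 Hermitian matrix standing in for the observable xi.
xi = np.array([[2.0, 1.0j, 0.0],
               [-1.0j, 3.0, 1.0],
               [0.0, 1.0, 1.0]])

# Columns of `kets` are normalized eigenkets |xi'>; eigh returns an orthonormal set.
eigenvalues, kets = np.linalg.eigh(xi)

# Equation (16): the numbers <xi'|xi''> should form the Kronecker delta.
gram = kets.conj().T @ kets

# Equation (22): the sum over xi' of |xi'><xi'| should be the unit operator.
completeness = sum(np.outer(kets[:, n], kets[:, n].conj()) for n in range(3))

print(np.allclose(gram, np.eye(3)))
print(np.allclose(completeness, np.eye(3)))
```

The continuous relations (21) and (24) have no exact finite-dimensional counterpart; a grid approximation would replace $\delta(\xi'-\xi'')$ by a Kronecker delta divided by the grid spacing.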

Equations (22) and (24) enable one to expand any bra or ket in terms of the basic vectors. For example, we get for the ket $|P\rangle$ in the discrete case, by multiplying (22) on the right by $|P\rangle$,

$$|P\rangle = \sum_{\xi'} |\xi'\rangle\langle\xi'|P\rangle, \tag{25}$$

which gives $|P\rangle$ expanded in terms of the $|\xi'\rangle$'s and shows that the coefficients in the expansion are $\langle\xi'|P\rangle$, which are just the numbers forming the representative of $|P\rangle$. Similarly, in the continuous case,

$$|P\rangle = \int |\xi'\rangle\,d\xi'\,\langle\xi'|P\rangle, \tag{26}$$

giving $|P\rangle$ as an integral over the $|\xi'\rangle$'s, with the coefficient in the integrand again just the representative $\langle\xi'|P\rangle$ of $|P\rangle$. The conjugate imaginary equations to (25) and (26) would give the bra vector $\langle P|$ expanded in terms of the basic bras.

Our present mathematical methods enable us in the continuous case to expand any ket as an integral of eigenkets of $\xi$. If we do not use the $\delta$ function notation, the expansion of a general ket will consist of an integral plus a sum, as in equation (25) of § 10, but the $\delta$ function enables us to replace the sum by an integral in which the integrand consists of terms each containing a $\delta$ function as a factor. For example, the eigenket $|\xi''\rangle$ may be replaced by an integral of eigenkets, as is shown by the second of equations (23).

If $\langle Q|$ is any bra and $|P\rangle$ any ket we get, by further applications of (22) and (24),

$$\langle Q|P\rangle = \sum_{\xi'} \langle Q|\xi'\rangle\langle\xi'|P\rangle \tag{27}$$

for discrete $\xi'$ and

$$\langle Q|P\rangle = \int \langle Q|\xi'\rangle\,d\xi'\,\langle\xi'|P\rangle \tag{28}$$

for continuous $\xi'$. These equations express the scalar product of $\langle Q|$ and $|P\rangle$ in terms of their representatives $\langle Q|\xi'\rangle$ and $\langle\xi'|P\rangle$. Equation (27) is just the usual formula for the scalar product of two vectors in terms of the coordinates of the vectors, and (28) is the natural modification of this formula for the case of continuous $\xi'$, with an integral instead of a sum.
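As an illustration of (25) and (27) in the discrete case, one may take a small finite-dimensional space and compare the two sides directly. This is a sketch only: the orthonormal basis is an arbitrary one produced by a QR factorization, and the vectors `P` and `Q` are random, none of which comes from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical orthonormal basis: columns of a unitary matrix from a QR factorization.
basis, _ = np.linalg.qr(rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4)))

P = rng.normal(size=4) + 1j * rng.normal(size=4)   # components of an arbitrary ket |P>
Q = rng.normal(size=4) + 1j * rng.normal(size=4)   # components of |Q>; the bra <Q| is its conjugate

rep_P = basis.conj().T @ P      # the numbers <xi'|P>
rep_Q = basis.conj().T @ Q      # the numbers <xi'|Q>, so that <Q|xi'> = conj(rep_Q)

# Equation (25): |P> rebuilt from its representative.
P_rebuilt = basis @ rep_P

# Equation (27): <Q|P> from the two representatives, compared with the direct product.
scalar_from_reps = np.sum(rep_Q.conj() * rep_P)
scalar_direct = np.vdot(Q, P)   # vdot conjugates its first argument

print(np.allclose(P_rebuilt, P), np.isclose(scalar_from_reps, scalar_direct))
```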

The generalization of the foregoing work to the case when $\xi$ has both discrete and continuous eigenvalues is quite straightforward. Using $\xi^r$ and $\xi^s$ to denote discrete eigenvalues and $\xi'$ and $\xi''$ to denote continuous eigenvalues, we have the set of equations

$$\langle\xi^r|\xi^s\rangle = \delta_{\xi^r\xi^s}, \qquad \langle\xi^r|\xi'\rangle = 0, \qquad \langle\xi'|\xi''\rangle = \delta(\xi'-\xi'') \tag{29}$$

as the generalization of (16) or (21). These equations express that the basic vectors are all orthogonal, that those belonging to discrete eigenvalues are normalized and those belonging to continuous eigenvalues have their lengths fixed by the same rule as led to (20). From (29) we can derive, as the generalization of (22) or (24),

$$\sum_{\xi^r} |\xi^r\rangle\langle\xi^r| + \int |\xi'\rangle\,d\xi'\,\langle\xi'| = 1, \tag{30}$$

the range of integration being the range of continuous eigenvalues. With the help of (30), we get immediately

$$|P\rangle = \sum_{\xi^r} |\xi^r\rangle\langle\xi^r|P\rangle + \int |\xi'\rangle\,d\xi'\,\langle\xi'|P\rangle \tag{31}$$

as the generalization of (25) or (26), and

$$\langle Q|P\rangle = \sum_{\xi^r} \langle Q|\xi^r\rangle\langle\xi^r|P\rangle + \int \langle Q|\xi'\rangle\,d\xi'\,\langle\xi'|P\rangle \tag{32}$$

as the generalization of (27) or (28).

Let us now pass to the general case when we have several commuting observables $\xi_1, \xi_2, \ldots, \xi_u$ forming a complete commuting set and set up an orthogonal representation in which the basic vectors are simultaneous eigenvectors of all of them, and are written $\langle\xi'_1\ldots\xi'_u|$, $|\xi'_1\ldots\xi'_u\rangle$. Let us suppose $\xi_1, \xi_2, \ldots, \xi_v$ ($v \leq u$) have discrete eigenvalues and $\xi_{v+1}, \ldots, \xi_u$ have continuous eigenvalues.

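Consider the quantity $\langle\xi'_1..\xi'_v\,\xi'_{v+1}..\xi'_u|\xi'_1..\xi'_v\,\xi''_{v+1}..\xi''_u\rangle$. From the orthogonality theorem, it must vanish unless each $\xi''_s = \xi'_s$ for $s = v+1, .., u$. By extending the work connected with expression (29) of § 10 to simultaneous eigenvectors of several commuting observables and extending also the axiom (30), we find that the $(u-v)$-fold integral of this quantity with respect to each $\xi''_s$ over a range extending through the value $\xi'_s$ is a finite positive number. Calling this number $c'$, the $'$ denoting that it is a function of $\xi'_1, .., \xi'_v, \xi'_{v+1}, .., \xi'_u$, we can express our results by the equation

$$\langle\xi'_1..\xi'_v\,\xi'_{v+1}..\xi'_u|\xi'_1..\xi'_v\,\xi''_{v+1}..\xi''_u\rangle = c'\,\delta(\xi'_{v+1}-\xi''_{v+1})\,..\,\delta(\xi'_u-\xi''_u), \tag{33}$$

with one $\delta$ factor on the right-hand side for each value of $s$ from $v+1$ to $u$. We now change the lengths of our basic vectors so as to make $c'$ unity, by a procedure similar to that which led to (20). By a further use of the orthogonality theorem, we get finally

$$\langle\xi'_1\ldots\xi'_u|\xi''_1\ldots\xi''_u\rangle = \delta_{\xi'_1\xi''_1}\,..\,\delta_{\xi'_v\xi''_v}\,\delta(\xi'_{v+1}-\xi''_{v+1})\,..\,\delta(\xi'_u-\xi''_u), \tag{34}$$

with a two-suffix $\delta$ symbol on the right-hand side for each $\xi$ with discrete eigenvalues and a $\delta$ function for each $\xi$ with continuous eigenvalues. This is the generalization of (16) or (21) to the case when there are several commuting observables in the complete set.

From (34) we can derive, as the generalization of (22) or (24),

$$\sum_{\xi'_1..\xi'_v} \int .. \int |\xi'_1\ldots\xi'_u\rangle\,d\xi'_{v+1}\,..\,d\xi'_u\,\langle\xi'_1\ldots\xi'_u| = 1, \tag{35}$$

the integral being a $(u-v)$-fold one over all the $\xi'$'s with continuous eigenvalues and the summation being over all the $\xi'$'s with discrete eigenvalues. Equations (34) and (35) give the fundamental properties of the basic vectors in the present case. From (35) we can immediately write down the generalization of (25) or (26) and of (27) or (28).

The case we have just considered can be further generalized by allowing some of the $\xi$'s to have both discrete and continuous eigenvalues. The modifications required in the equations are quite straightforward, but will not be given here as they are rather cumbersome to write down in general form.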

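There are some problems in which it is convenient not to make the $c'$ of equation (33) equal unity, but to make it equal to some definite function of the $\xi'$'s instead. Calling this function of the $\xi'$'s $\rho'^{-1}$, we then have, instead of (34),

$$\langle\xi'_1\ldots\xi'_u|\xi''_1\ldots\xi''_u\rangle = \rho'^{-1}\,\delta_{\xi'_1\xi''_1}\,..\,\delta_{\xi'_v\xi''_v}\,\delta(\xi'_{v+1}-\xi''_{v+1})\,..\,\delta(\xi'_u-\xi''_u), \tag{36}$$

and instead of (35) we get

$$\sum_{\xi'_1..\xi'_v} \int .. \int |\xi'_1\ldots\xi'_u\rangle\,\rho'\,d\xi'_{v+1}\,..\,d\xi'_u\,\langle\xi'_1\ldots\xi'_u| = 1. \tag{37}$$

$\rho'$ is called the weight function of the representation, $\rho'\,d\xi'_{v+1}\,..\,d\xi'_u$ being the 'weight' attached to a small volume element of the space of the variables $\xi'_{v+1}, .., \xi'_u$.

The representations we considered previously all had the weight function unity. The introduction of a weight function not unity is entirely a matter of convenience and does not add anything to the mathematical power of the representation. The basic bras $\langle\xi'_1\ldots\xi'_u{}^*|$ of a representation with the weight function $\rho'$ are connected with the basic bras $\langle\xi'_1\ldots\xi'_u|$ of the corresponding representation with the weight function unity by

$$\langle\xi'_1\ldots\xi'_u{}^*| = \rho'^{-1/2}\,\langle\xi'_1\ldots\xi'_u|, \tag{38}$$

as is easily verified. An example of a useful representation with non-unit weight function occurs when one has two $\xi$'s which are the polar and azimuthal angles $\theta$ and $\phi$ giving a direction in three-dimensional space and one takes $\rho' = \sin\theta'$. One then has the element of solid angle $\sin\theta'\,d\theta'\,d\phi'$ occurring in (37).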

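17. The representation of linear operators

In § 14 we saw how to represent ket and bra vectors by sets of numbers. We now have to do the same for linear operators, in order to have a complete scheme for representing all our abstract quantities by sets of numbers. The same basic vectors that we had in § 14 can be used again for this purpose.

Let us suppose the basic vectors are simultaneous eigenvectors of a complete set of commuting observables $\xi_1, \xi_2, \ldots, \xi_u$. If $\alpha$ is any linear operator, we take a general basic bra $\langle\xi'_1\ldots\xi'_u|$ and a general basic ket $|\xi''_1\ldots\xi''_u\rangle$ and form the numbers

$$\langle\xi'_1\ldots\xi'_u|\alpha|\xi''_1\ldots\xi''_u\rangle. \tag{39}$$

These numbers are sufficient to determine $\alpha$ completely, since in the first place they determine the ket $\alpha|\xi''_1\ldots\xi''_u\rangle$ (as they provide the representative of this ket), and the value of this ket for all the basic kets $|\xi''_1\ldots\xi''_u\rangle$ determines $\alpha$. The numbers (39) are called the representative of the linear operator $\alpha$ or of the dynamical variable $\alpha$. They are more complicated than the representative of a ket or bra vector in that they involve the parameters that label two basic vectors instead of one.

Let us examine the form of these numbers in simple cases. Take first the case when there is only one $\xi$, forming a complete commuting set by itself, and suppose that it has discrete eigenvalues $\xi'$. The representative of $\alpha$ is then the discrete set of numbers $\langle\xi'|\alpha|\xi''\rangle$. If one had to write out these numbers explicitly, the natural way of arranging them would be as a two-dimensional array, thus:

$$\begin{matrix}
\langle\xi^1|\alpha|\xi^1\rangle & \langle\xi^1|\alpha|\xi^2\rangle & \langle\xi^1|\alpha|\xi^3\rangle & \cdots \\
\langle\xi^2|\alpha|\xi^1\rangle & \langle\xi^2|\alpha|\xi^2\rangle & \langle\xi^2|\alpha|\xi^3\rangle & \cdots \\
\langle\xi^3|\alpha|\xi^1\rangle & \langle\xi^3|\alpha|\xi^2\rangle & \langle\xi^3|\alpha|\xi^3\rangle & \cdots \\
\vdots & \vdots & \vdots &
\end{matrix} \tag{40}$$

where $\xi^1, \xi^2, \xi^3, ..$ are all the eigenvalues of $\xi$. Such an array is called a matrix and the numbers are called the elements of the matrix. We make the convention that the elements must always be arranged so that those in the same row refer to the same basic bra vector and those in the same column refer to the same basic ket vector.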

An element $\langle\xi'|\alpha|\xi'\rangle$ referring to two basic vectors with the same label is called a diagonal element of the matrix, as all such elements lie on a diagonal. If we put $\alpha$ equal to unity, we have from (16) all the diagonal elements equal to unity and all the other elements equal to zero. The matrix is then called the unit matrix.

If $\alpha$ is real, we have

$$\langle\xi'|\alpha|\xi''\rangle = \overline{\langle\xi''|\alpha|\xi'\rangle}. \tag{41}$$

The effect of these conditions on the matrix (40) is to make the diagonal elements all real and each of the other elements equal the conjugate complex of its mirror reflection in the diagonal. The matrix is then called a Hermitian matrix.

If we put $\alpha$ equal to $\xi$, we get for a general element of the matrix

$$\langle\xi'|\xi|\xi''\rangle = \xi'\,\langle\xi'|\xi''\rangle = \xi'\,\delta_{\xi'\xi''}. \tag{42}$$

Thus all the elements not on the diagonal are zero. The matrix is then called a diagonal matrix. Its diagonal elements are just equal to the eigenvalues of $\xi$. More generally, if we put $\alpha$ equal to $f(\xi)$, a function of $\xi$, we get

$$\langle\xi'|f(\xi)|\xi''\rangle = f(\xi')\,\delta_{\xi'\xi''}, \tag{43}$$

and the matrix is again a diagonal matrix.

Let us determine the representative of a product $\alpha\beta$ of two linear operators $\alpha$ and $\beta$ in terms of the representatives of the factors. From equation (22) with $\xi'''$ substituted for $\xi'$ we obtain

$$\langle\xi'|\alpha\beta|\xi''\rangle = \langle\xi'|\alpha \sum_{\xi'''} |\xi'''\rangle\langle\xi'''|\beta|\xi''\rangle = \sum_{\xi'''} \langle\xi'|\alpha|\xi'''\rangle\langle\xi'''|\beta|\xi''\rangle, \tag{44}$$

which gives us the required result. Equation (44) shows that the matrix formed by the elements $\langle\xi'|\alpha\beta|\xi''\rangle$ equals the product of the matrices formed by the elements $\langle\xi'|\alpha|\xi''\rangle$ and $\langle\xi'|\beta|\xi''\rangle$ respectively, according to the usual mathematical rule for multiplying matrices. This rule gives for the element in the $r$th row and $s$th column of the product matrix the sum of the product of each element in the $r$th row of the first factor matrix with the corresponding element in the $s$th column of the second factor matrix. The multiplication of matrices is non-commutative, like the multiplication of linear operators.

We can summarize our results for the case when there is only one $\xi$ and it has discrete eigenvalues as follows:

(i) Any linear operator is represented by a matrix.

(ii) The unit operator is represented by the unit matrix.

(iii) A real linear operator is represented by a Hermitian matrix.

(iv) $\xi$ and functions of $\xi$ are represented by diagonal matrices.

(v) The matrix representing the product of two linear operators is the product of the matrices representing the two factors.
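The rules (i)-(v) can be checked numerically for a small discrete example. What follows is a sketch under stated assumptions: the operators are random matrices on a four-dimensional space, the function $f(\xi) = \xi^2 + 2\xi$ is chosen arbitrarily, and the helper `representative` is a hypothetical name introduced only for this illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

def random_hermitian(n):
    """Return a random n x n Hermitian matrix (a 'real' linear operator)."""
    a = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (a + a.conj().T) / 2

n = 4
xi = random_hermitian(n)       # observable whose eigenkets serve as basic vectors
alpha = random_hermitian(n)    # a real linear operator
beta = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))  # a general linear operator

eigvals, kets = np.linalg.eigh(xi)   # columns of `kets` are the basic kets |xi'>

def representative(op):
    """Matrix of the numbers <xi'|op|xi''>: rows labelled by bras, columns by kets."""
    return kets.conj().T @ op @ kets

unit_rep = representative(np.eye(n))           # rule (ii): unit matrix
alpha_rep = representative(alpha)              # rule (iii): Hermitian matrix
xi_rep = representative(xi)                    # rule (iv): diagonal, eigenvalues on the diagonal
f_xi_rep = representative(xi @ xi + 2 * xi)    # rule (iv) for f(xi) = xi^2 + 2 xi
product_rep = representative(alpha @ beta)     # rule (v): product of the two matrices

print(np.allclose(unit_rep, np.eye(n)))
print(np.allclose(alpha_rep, alpha_rep.conj().T))
print(np.allclose(xi_rep, np.diag(eigvals)))
print(np.allclose(f_xi_rep, np.diag(eigvals**2 + 2 * eigvals)))
print(np.allclose(product_rep, representative(alpha) @ representative(beta)))
```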

Let us now consider the case when there is only one $\xi$ and it has continuous eigenvalues. The representative of $\alpha$ is now $\langle\xi'|\alpha|\xi''\rangle$, a function of two variables $\xi'$ and $\xi''$ which can vary continuously. It is convenient to call such a function a 'matrix', using this word in a generalized sense, in order that we may be able to use the same terminology for the discrete and continuous cases. One of these generalized matrices cannot, of course, be written out as a two-dimensional array like an ordinary matrix, since the number of its rows and columns is an infinity equal to the number of points on a line, and the number of its elements is an infinity equal to the number of points in an area.

We arrange our definitions concerning these generalized matrices so that the rules (i)-(v) which we had above for the discrete case hold also for the continuous case. The unit operator is represented by $\delta(\xi'-\xi'')$ and the generalized matrix formed by these elements we define to be the unit matrix. We still have equation (41) as the condition for $\alpha$ to be real and we define the generalized matrix formed by the elements $\langle\xi'|\alpha|\xi''\rangle$ to be Hermitian when it satisfies this condition. $\xi$ is represented by

$$\langle\xi'|\xi|\xi''\rangle = \xi'\,\delta(\xi'-\xi'') \tag{45}$$

and $f(\xi)$ by

$$\langle\xi'|f(\xi)|\xi''\rangle = f(\xi')\,\delta(\xi'-\xi''), \tag{46}$$

and the generalized matrices formed by these elements we define to be diagonal matrices. From (11), we could equally well have $\xi''$ and $f(\xi'')$ as the coefficients of $\delta(\xi'-\xi'')$ on the right-hand sides of (45) and (46) respectively. Corresponding to equation (44) we now have, from (24),

$$\langle\xi'|\alpha\beta|\xi''\rangle = \int \langle\xi'|\alpha|\xi'''\rangle\,d\xi'''\,\langle\xi'''|\beta|\xi''\rangle, \tag{47}$$

with an integral instead of a sum, and we define the generalized matrix formed by the elements on the right-hand side here to be the product of the matrices formed by $\langle\xi'|\alpha|\xi''\rangle$ and $\langle\xi'|\beta|\xi''\rangle$. With these definitions we secure complete parallelism between the discrete and continuous cases and we have the rules (i)-(v) holding for both.
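One way to get a feeling for these generalized matrices is to sample the continuous variable on a grid. The following is only a rough sketch under that assumption: the integral in (47) is replaced by a Riemann sum with an assumed spacing `h`, and $\delta(\xi'-\xi'')$ is replaced by a Kronecker delta divided by `h`. The grid, the kernel `alpha` and the choice $f = \cos$ are all hypothetical.

```python
import numpy as np

h = 0.05                                   # grid spacing (assumed)
grid = np.arange(-1.0, 1.0, h)             # sample points standing in for xi'
n = len(grid)

def product(a, b):
    """Riemann-sum version of (47): the integral over xi''' becomes h times a sum."""
    return h * (a @ b)

unit = np.eye(n) / h                       # stands in for delta(xi' - xi'')
xi_rep = np.diag(grid) / h                 # xi' delta(xi' - xi''), as in (45)
f_rep = np.diag(np.cos(grid)) / h          # f(xi') delta(xi' - xi''), as in (46)

rng = np.random.default_rng(6)
alpha = rng.normal(size=(n, n))            # arbitrary kernel <xi'|alpha|xi''>

# The discretized unit matrix leaves any kernel unchanged under the product rule,
# and the two discretized diagonal matrices commute with each other.
print(np.allclose(product(unit, alpha), alpha))
print(np.allclose(product(xi_rep, f_rep), product(f_rep, xi_rep)))
```

The factor `1/h` on the diagonal is what grows without limit as the grid is refined, which is the discrete shadow of the singularity of the $\delta$ function.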

The question arises how a general diagonal matrix is to be defined in the continuous case, as so far we have only defined the right-hand sides of (45) and (46) to be examples of diagonal matrices. One might be inclined to define as diagonal any matrix whose $(\xi', \xi'')$ elements all vanish except when $\xi'$ differs infinitely little from $\xi''$, but this would not be satisfactory, because an important property of diagonal matrices in the discrete case is that they always commute with one another and we want this property to hold also in the continuous case. In order that the matrix formed by the elements $\langle\xi'|\omega|\xi''\rangle$ in the continuous case may commute with that formed by the elements on the right-hand side of (45) we must have, using the multiplication rule (47),

$$\int \langle\xi'|\omega|\xi'''\rangle\,d\xi'''\,\xi'''\,\delta(\xi'''-\xi'') = \int \xi'\,\delta(\xi'-\xi''')\,d\xi'''\,\langle\xi'''|\omega|\xi''\rangle.$$

With the help of formula (4), this reduces to

$$\langle\xi'|\omega|\xi''\rangle\,\xi'' = \xi'\,\langle\xi'|\omega|\xi''\rangle \tag{48}$$

or

$$(\xi'-\xi'')\,\langle\xi'|\omega|\xi''\rangle = 0.$$

This gives, according to the rule by which (13) follows from (12),

$$\langle\xi'|\omega|\xi''\rangle = c'\,\delta(\xi'-\xi''),$$

where $c'$ is a number that may depend on $\xi'$. Thus $\langle\xi'|\omega|\xi''\rangle$ is of the form of the right-hand side of (46). For this reason we define only matrices whose elements are of the form of the right-hand side of (46) to be diagonal matrices. It is easily verified that these matrices all commute with one another. One can form other matrices whose $(\xi', \xi'')$ elements all vanish when $\xi'$ differs appreciably from $\xi''$ and have a different form of singularity when $\xi'$ equals $\xi''$ [we shall later introduce the derivative $\delta'(x)$ of the $\delta$ function and $\delta'(\xi'-\xi'')$ will then be an example, see § 22 equation (19)], but these other matrices are not diagonal according to the definition.
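The discrete analogue of the argument leading to (48) is easy to check numerically: the commutator of any matrix with the diagonal matrix representing $\xi$ has elements $(\xi''-\xi')\langle\xi'|\omega|\xi''\rangle$, so it vanishes only when the off-diagonal elements do. A minimal sketch, assuming three distinct hypothetical eigenvalues and a random matrix for $\omega$:

```python
import numpy as np

xi_values = np.array([1.0, 2.0, 4.0])        # distinct eigenvalues of xi (assumed)
xi_rep = np.diag(xi_values)                  # diagonal matrix representing xi

rng = np.random.default_rng(2)
omega = rng.normal(size=(3, 3))              # arbitrary matrix <xi'|omega|xi''>

# Elements of omega*xi - xi*omega are (xi'' - xi') <xi'|omega|xi''>.
commutator = omega @ xi_rep - xi_rep @ omega
predicted = (xi_values[None, :] - xi_values[:, None]) * omega

# Keeping only the diagonal part of omega makes the commutator vanish.
omega_diag = np.diag(np.diag(omega))
commutator_diag = omega_diag @ xi_rep - xi_rep @ omega_diag

print(np.allclose(commutator, predicted))
print(np.allclose(commutator_diag, 0.0))
```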

Let us now pass on to the case when there is only one $\xi$ and it has both discrete and continuous eigenvalues. Using $\xi^r$, $\xi^s$ to denote discrete eigenvalues and $\xi'$, $\xi''$ to denote continuous eigenvalues, we now have the representative of $\alpha$ consisting of four kinds of quantities, $\langle\xi^r|\alpha|\xi^s\rangle$, $\langle\xi^r|\alpha|\xi'\rangle$, $\langle\xi'|\alpha|\xi^r\rangle$, $\langle\xi'|\alpha|\xi''\rangle$. These quantities can all be put together and considered to form a more general kind of matrix having some discrete rows and columns and also a continuous range of rows and columns. We define unit matrix, Hermitian matrix, diagonal matrix, and the product of two matrices also for this more general kind of matrix so as to make the rules (i)-(v) still hold. The details are a straightforward generalization of what has gone before and need not be given explicitly.

Let us now go back to the general case of several $\xi$'s, $\xi_1, \xi_2, \ldots, \xi_u$. The representative of $\alpha$, expression (39), may still be looked upon as forming a matrix, with rows corresponding to different values of $\xi'_1, \ldots, \xi'_u$ and columns corresponding to different values of $\xi''_1, \ldots, \xi''_u$. Unless all the $\xi$'s have discrete eigenvalues, this matrix will be of the generalized kind with continuous ranges of rows and columns. We again arrange our definitions so that the rules (i)-(v) hold, with rule (iv) generalized to:

(iv') Each $\xi_m$ ($m = 1, 2, \ldots, u$) and any function of them is represented by a diagonal matrix.

A diagonal matrix is now defined as one whose general element $\langle\xi'_1\ldots\xi'_u|\omega|\xi''_1\ldots\xi''_u\rangle$ is of the form

$$\langle\xi'_1\ldots\xi'_u|\omega|\xi''_1\ldots\xi''_u\rangle = c'\,\delta_{\xi'_1\xi''_1}\,..\,\delta_{\xi'_v\xi''_v}\,\delta(\xi'_{v+1}-\xi''_{v+1})\,..\,\delta(\xi'_u-\xi''_u) \tag{49}$$

in the case when $\xi_1, .., \xi_v$ have discrete eigenvalues and $\xi_{v+1}, .., \xi_u$ have continuous eigenvalues, $c'$ being any function of the $\xi'$'s. This definition is the generalization of what we had with one $\xi$ and makes diagonal matrices always commute with one another. The other definitions are straightforward and need not be given explicitly.

We  now  have  a linear  operator  always  represented  by  a matrix. 
The  sum  of  two  linear  operators  is  represented  by  the  sum  of  the 
matrices  representing  the  operators  and  this,  together  with  rule  (v), 
means  that  the  matrices  are  subject  to  the  same  algebraic  relations  as 
the  linear  operators.  If  any  algebraic  equation  holds  between  certain 
linear  operators,  the  same  equation  must  hold  between  the  matrices 
representing  those  operators. 

The scheme of matrices can be extended to bring in the representatives of ket and bra vectors. The matrices representing linear operators are all square matrices with the same number of rows and columns, and with, in fact, a one-one correspondence between their rows and columns. We may look upon the representative of a ket $|P\rangle$ as a matrix with a single column by setting all the numbers $\langle\xi'_1\ldots\xi'_u|P\rangle$ which form this representative one below the other. The number of rows in this matrix will be the same as the number of rows or columns in the square matrices representing linear operators. Such a single-column matrix can be multiplied on the left by a square matrix $\langle\xi'_1\ldots\xi'_u|\alpha|\xi''_1\ldots\xi''_u\rangle$ representing a linear operator, by a rule similar to that for the multiplication of two square matrices. The product is another single-column matrix with elements given by

$$\sum_{\xi''_1..\xi''_v} \int .. \int \langle\xi'_1\ldots\xi'_u|\alpha|\xi''_1\ldots\xi''_u\rangle\,d\xi''_{v+1}\,..\,d\xi''_u\,\langle\xi''_1\ldots\xi''_u|P\rangle.$$

From (35) this is just equal to $\langle\xi'_1\ldots\xi'_u|\alpha|P\rangle$, the representative of $\alpha|P\rangle$. Similarly we may look upon the representative of a bra $\langle Q|$ as a matrix with a single row by setting all the numbers $\langle Q|\xi'_1\ldots\xi'_u\rangle$ side by side. Such a single-row matrix may be multiplied on the right by a square matrix $\langle\xi'_1\ldots\xi'_u|\alpha|\xi''_1\ldots\xi''_u\rangle$, the product being another single-row matrix, which is just the representative of $\langle Q|\alpha$. The single-row matrix representing $\langle Q|$ may be multiplied on the right by the single-column matrix representing $|P\rangle$, the product being a matrix with just a single element, which is equal to $\langle Q|P\rangle$. Finally, the single-row matrix representing $\langle Q|$ may be multiplied on the left by the single-column matrix representing $|P\rangle$, the product being a square matrix, which is just the representative of $|P\rangle\langle Q|$. In this way all our abstract symbols, linear operators, bra vectors, and ket vectors, can be represented by matrices, which are subject to the same algebraic relations as the abstract symbols themselves.
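The four kinds of product just described correspond directly to array shapes in a numerical setting. The sketch below assumes a two-dimensional space with discrete eigenvalues only; the particular numbers are hypothetical and serve just to show the shapes.

```python
import numpy as np

alpha = np.array([[1.0, 2.0j], [-2.0j, 0.5]])   # square matrix representing an operator
P = np.array([[1.0], [1.0j]])                   # ket |P> as a single-column matrix
Q_ket = np.array([[2.0], [1.0 - 1.0j]])         # the numbers <xi'|Q> as a column
Q = Q_ket.conj().T                              # bra <Q| as a single-row matrix

alpha_P = alpha @ P     # single column: representative of alpha|P>
Q_alpha = Q @ alpha     # single row: representative of <Q|alpha
Q_P = Q @ P             # single element: the number <Q|P>
P_Q = P @ Q             # square matrix: representative of |P><Q|

print(alpha_P.shape, Q_alpha.shape, Q_P.shape, P_Q.shape)
```

Because matrix multiplication is associative, `(Q @ alpha) @ P` and `Q @ (alpha @ P)` agree, mirroring the fact that $\langle Q|\alpha|P\rangle$ needs no brackets.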


18.  Probability  amplitudes 

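Representations are of great importance in the physical interpretation of quantum mechanics as they provide a convenient method for obtaining the probabilities of observables having given values. In § 12 we obtained the probability of an observable having any specified value for a given state and in § 13 we generalized this result and obtained the probability of a set of commuting observables simultaneously having specified values for a given state. Let us now apply this result to a complete set of commuting observables, say the set of $\xi$'s which we have been dealing with already. According to formula (51) of § 13, the probability of each $\xi_r$ having the value $\xi'_r$ for the state corresponding to the normalized ket vector $|x\rangle$ is

$$P_{\xi'_1\ldots\xi'_u} = \langle x|\,\delta_{\xi_1\xi'_1}\,\delta_{\xi_2\xi'_2}\ldots\delta_{\xi_u\xi'_u}\,|x\rangle. \tag{50}$$

If the $\xi$'s all have discrete eigenvalues, we can use (35) with $v = u$ and no integrals, and get

$$\begin{aligned}
P_{\xi'_1\ldots\xi'_u} &= \sum_{\xi''_1\ldots\xi''_u} \langle x|\,\delta_{\xi_1\xi'_1}\,\delta_{\xi_2\xi'_2}\ldots\delta_{\xi_u\xi'_u}\,|\xi''_1\ldots\xi''_u\rangle\langle\xi''_1\ldots\xi''_u|x\rangle \\
&= \sum_{\xi''_1\ldots\xi''_u} \langle x|\,\delta_{\xi''_1\xi'_1}\,\delta_{\xi''_2\xi'_2}\ldots\delta_{\xi''_u\xi'_u}\,|\xi''_1\ldots\xi''_u\rangle\langle\xi''_1\ldots\xi''_u|x\rangle \\
&= \langle x|\xi'_1\ldots\xi'_u\rangle\langle\xi'_1\ldots\xi'_u|x\rangle \\
&= |\langle\xi'_1\ldots\xi'_u|x\rangle|^2.
\end{aligned} \tag{51}$$

We thus get the simple result that the probability of the $\xi$'s having the values $\xi'$ is just the square of the modulus of the appropriate coordinate of the normalized ket vector corresponding to the state concerned.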
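The result (51) can be illustrated with a two-dimensional example. This is a sketch only: the matrix `xi` is a hypothetical observable with eigenvalues $-1$ and $+1$, and the ket `x` is an arbitrary normalized vector.

```python
import numpy as np

xi = np.array([[0.0, 1.0], [1.0, 0.0]])   # hypothetical observable with eigenvalues -1, +1
eigvals, kets = np.linalg.eigh(xi)        # columns of `kets` are the basic kets |xi'>

x = np.array([2.0, 1.0])
x = x / np.linalg.norm(x)                 # normalized ket |x>

amplitudes = kets.conj().T @ x            # the coordinates <xi'|x>
probabilities = np.abs(amplitudes) ** 2   # equation (51)

for value, prob in zip(eigvals, probabilities):
    print(f"xi' = {value:+.0f}: probability {prob:.3f}")
print("total:", probabilities.sum())
```

Since the ket is normalized and the basic kets form a complete set, the probabilities add up to unity.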

If the $\xi$'s do not all have discrete eigenvalues, but if, say, $\xi_1, .., \xi_v$ have discrete eigenvalues and $\xi_{v+1}, .., \xi_u$ have continuous eigenvalues, then to get something physically significant we must obtain the probability of each $\xi_r$ ($r = 1, .., v$) having a specified value $\xi'_r$ and each $\xi_s$ ($s = v+1, .., u$) lying in a specified small range $\xi'_s$ to $\xi'_s + d\xi'_s$. For this purpose we must replace each factor $\delta_{\xi_s\xi'_s}$ in (50) by a factor $\chi_s$, which is that function of the observable $\xi_s$ which is equal to unity for $\xi_s$ within the range $\xi'_s$ to $\xi'_s + d\xi'_s$ and zero otherwise. Proceeding as before with the help of (35), we obtain for this probability

$$P_{\xi'_1\ldots\xi'_u}\,d\xi'_{v+1}\,..\,d\xi'_u = |\langle\xi'_1\ldots\xi'_u|x\rangle|^2\,d\xi'_{v+1}\,..\,d\xi'_u. \tag{52}$$

Thus in every case the probability distribution of values for the $\xi$'s is given by the square of the modulus of the representative of the normalized ket vector corresponding to the state concerned.

The numbers which form the representative of a normalized ket (or bra) may for this reason be called probability amplitudes. The square of the modulus of a probability amplitude is an ordinary probability, or a probability per unit range for those variables that have continuous ranges of values.

We may be interested in a state whose corresponding ket $|x\rangle$ cannot be normalized. This occurs, for example, if the state is an eigenstate of some observable belonging to an eigenvalue lying in a range of eigenvalues. The formula (51) or (52) can then still be used to give the relative probability of the $\xi$'s having specified values or having values lying in specified small ranges, i.e. it will give correctly the ratios of the probabilities for different $\xi'$'s. The numbers $\langle\xi'_1\ldots\xi'_u|x\rangle$ may then be called relative probability amplitudes.

The representation for which the above results hold is characterized by the basic vectors being simultaneous eigenvectors of all the $\xi$'s. It may also be characterized by the requirement that each of the $\xi$'s shall be represented by a diagonal matrix, this condition being easily seen to be equivalent to the previous one. The latter characterization is usually the more convenient one. For brevity, we shall formulate it as each of the $\xi$'s 'being diagonal in the representation'.

Provided the $\xi$'s form a complete set of commuting observables, the representation is completely determined by the characterization, apart from arbitrary phase factors in the basic vectors. Each basic bra $\langle\xi'_1\ldots\xi'_u|$ may be multiplied by $e^{i\gamma'}$, where $\gamma'$ is any real function of the variables $\xi'_1, \ldots, \xi'_u$, without changing any of the conditions which the representation has to satisfy, i.e. the condition that the $\xi$'s are diagonal or that the basic vectors are simultaneous eigenvectors of the $\xi$'s, and the fundamental properties of the basic vectors (34) and (35). With the basic bras changed in this way, the representative $\langle\xi'_1\ldots\xi'_u|P\rangle$ of a ket $|P\rangle$ gets multiplied by $e^{i\gamma'}$, the representative $\langle Q|\xi'_1\ldots\xi'_u\rangle$ of a bra $\langle Q|$ gets multiplied by $e^{-i\gamma'}$ and the representative $\langle\xi'_1\ldots\xi'_u|\alpha|\xi''_1\ldots\xi''_u\rangle$ of a linear operator $\alpha$ gets multiplied by $e^{i(\gamma'-\gamma'')}$. The probabilities or relative probabilities (51), (52) are, of course, unaltered.
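The effect of the arbitrary phase factors can be seen in a small numerical sketch. The assumptions here are a three-dimensional space with discrete eigenvalues, a random orthonormal basis, and random phases `gamma`; none of these particular values comes from the text.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 3
# Hypothetical orthonormal basis: columns are the basic kets.
basis, _ = np.linalg.qr(rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)))
gamma = rng.uniform(0, 2 * np.pi, size=n)        # arbitrary real gamma' for each basic vector

# Multiplying each basic bra by exp(i gamma') multiplies each basic ket by exp(-i gamma').
new_basis = basis * np.exp(-1j * gamma)[None, :]

P = rng.normal(size=n) + 1j * rng.normal(size=n)               # arbitrary ket |P>
alpha = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)) # arbitrary linear operator

old_rep_P = basis.conj().T @ P
new_rep_P = new_basis.conj().T @ P               # expected: exp(i gamma') times old_rep_P
old_rep_alpha = basis.conj().T @ alpha @ basis
new_rep_alpha = new_basis.conj().T @ alpha @ new_basis
phase = np.exp(1j * (gamma[:, None] - gamma[None, :]))   # exp(i(gamma' - gamma''))

print(np.allclose(new_rep_P, np.exp(1j * gamma) * old_rep_P))
print(np.allclose(new_rep_alpha, phase * old_rep_alpha))
print(np.allclose(np.abs(new_rep_P) ** 2, np.abs(old_rep_P) ** 2))
```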

The probabilities that one calculates in practical problems in quantum mechanics are nearly always obtained from the squares of the moduli of probability amplitudes or relative probability amplitudes. Even when one is interested only in the probability of an incomplete set of commuting observables having specified values, it is usually necessary first to make the set a complete one by the introduction of some extra commuting observables and to obtain the probability of the complete set having specified values (as the square of the modulus of a probability amplitude), and then to sum or integrate over all possible values of the extra observables. A more direct application of formula (51) of § 13 is usually not practicable.
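The procedure of summing over the extra observables may be sketched as follows. The assumptions are purely illustrative: one observable $\xi$ with two discrete values is made into a complete set by a hypothetical extra observable $\eta$ with three discrete values, and the amplitudes are random.

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical amplitudes <xi' eta'|x>: rows label xi', columns label the extra observable eta'.
amplitudes = rng.normal(size=(2, 3)) + 1j * rng.normal(size=(2, 3))
amplitudes /= np.linalg.norm(amplitudes)     # normalize the ket |x>

joint = np.abs(amplitudes) ** 2              # probability of xi' and eta' together
prob_xi = joint.sum(axis=1)                  # sum over all values of the extra observable

print(prob_xi, prob_xi.sum())
```

For an extra observable with continuous eigenvalues the sum over `axis=1` would become an integral.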

To introduce a representation in practice

(i) We look for observables which we would like to have diagonal, either because we are interested in their probabilities or for reasons of mathematical simplicity;

(ii) We must see that they all commute, a necessary condition since diagonal matrices always commute;

(iii) We then see that they form a complete commuting set, and if not we add some more commuting observables to them to make them into a complete commuting set;

(iv) We set up an orthogonal representation with this complete commuting set diagonal.

The representation is then completely determined except for the arbitrary phase factors. For most purposes the arbitrary phase factors are unimportant and trivial, so that we may count the representation as being completely determined by the observables that are diagonal in it. This fact is already implied in our notation, since the only indication in a representative of the representation to which it belongs are the letters denoting the observables that are diagonal.

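It may be that we are interested in two representations for the same dynamical system. Suppose that in one of them the complete set of commuting observables $\xi_1, \ldots, \xi_u$ are diagonal and the basic bras are $\langle\xi'_1\ldots\xi'_u|$ and in the other the complete set of commuting observables $\eta_1, \ldots, \eta_w$ are diagonal and the basic bras are $\langle\eta'_1\ldots\eta'_w|$. A ket $|P\rangle$ will now have the two representatives $\langle\xi'_1\ldots\xi'_u|P\rangle$ and $\langle\eta'_1\ldots\eta'_w|P\rangle$. If $\xi_1, .., \xi_v$ have discrete eigenvalues and $\xi_{v+1}, .., \xi_u$ have continuous eigenvalues and if $\eta_1, .., \eta_x$ have discrete eigenvalues and $\eta_{x+1}, .., \eta_w$ have continuous eigenvalues, we get from (35)

$$\langle\eta'_1\ldots\eta'_w|P\rangle = \sum_{\xi'_1..\xi'_v} \int .. \int \langle\eta'_1\ldots\eta'_w|\xi'_1\ldots\xi'_u\rangle\,d\xi'_{v+1}\,..\,d\xi'_u\,\langle\xi'_1\ldots\xi'_u|P\rangle, \tag{53}$$

and interchanging $\xi$'s and $\eta$'s

$$\langle\xi'_1\ldots\xi'_u|P\rangle = \sum_{\eta'_1..\eta'_x} \int .. \int \langle\xi'_1\ldots\xi'_u|\eta'_1\ldots\eta'_w\rangle\,d\eta'_{x+1}\,..\,d\eta'_w\,\langle\eta'_1\ldots\eta'_w|P\rangle. \tag{54}$$

These are the transformation equations which give one representative of $|P\rangle$ in terms of the other. They show that either representative is expressible linearly in terms of the other, with the quantities

$$\langle\eta'_1\ldots\eta'_w|\xi'_1\ldots\xi'_u\rangle, \qquad \langle\xi'_1\ldots\xi'_u|\eta'_1\ldots\eta'_w\rangle \tag{55}$$

as coefficients. These quantities are called the transformation functions. Similar equations may be written down to connect the two representatives of a bra vector or of a linear operator. The transformation functions (55) are in every case the means which enable one to pass from one representative to the other. Each of the transformation functions is the conjugate complex of the other, and they satisfy the conditions

$$\sum_{\xi'_1..\xi'_v} \int .. \int \langle\eta'_1\ldots\eta'_w|\xi'_1\ldots\xi'_u\rangle\,d\xi'_{v+1}\,..\,d\xi'_u\,\langle\xi'_1\ldots\xi'_u|\eta''_1\ldots\eta''_w\rangle = \delta_{\eta'_1\eta''_1}\,..\,\delta_{\eta'_x\eta''_x}\,\delta(\eta'_{x+1}-\eta''_{x+1})\,..\,\delta(\eta'_w-\eta''_w) \tag{56}$$

and the corresponding conditions with $\xi$'s and $\eta$'s interchanged, as may be verified from (35) and (34) and the corresponding equations for the $\eta$'s.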
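When all the eigenvalues are discrete, the transformation functions (55) form an ordinary unitary matrix, and (53), (54) and (56) become matrix identities. The following sketch assumes a three-dimensional space; the two bases are eigenbases of random Hermitian matrices, and the helper `eigenbasis` is a hypothetical name used only here.

```python
import numpy as np

rng = np.random.default_rng(5)

def eigenbasis(n):
    """Orthonormal eigenkets (as columns) of a random n x n Hermitian matrix."""
    a = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return np.linalg.eigh((a + a.conj().T) / 2)[1]

n = 3
xi_kets = eigenbasis(n)      # columns are the basic kets |xi'>
eta_kets = eigenbasis(n)     # columns are the basic kets |eta'>

# Transformation functions (55).
eta_xi = eta_kets.conj().T @ xi_kets     # <eta'|xi'>
xi_eta = xi_kets.conj().T @ eta_kets     # <xi'|eta'>

P = rng.normal(size=n) + 1j * rng.normal(size=n)   # arbitrary ket |P>
rep_xi = xi_kets.conj().T @ P            # <xi'|P>
rep_eta = eta_kets.conj().T @ P          # <eta'|P>

print(np.allclose(eta_xi @ rep_xi, rep_eta))       # equation (53)
print(np.allclose(xi_eta @ rep_eta, rep_xi))       # equation (54)
print(np.allclose(xi_eta, eta_xi.conj().T))        # each is the conjugate complex of the other
print(np.allclose(eta_xi @ xi_eta, np.eye(n)))     # equation (56)
```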

Transformation functions are examples of probability amplitudes or relative probability amplitudes. Let us take the case when all the $\xi$'s and all the $\eta$'s have discrete eigenvalues. Then the basic ket $|\eta'_1\ldots\eta'_w\rangle$ is normalized, so that its representative in the $\xi$-representation, $\langle\xi'_1\ldots\xi'_u|\eta'_1\ldots\eta'_w\rangle$, is a probability amplitude for each set of values for the $\xi'$'s. The state to which these probability amplitudes refer, namely the state corresponding to $|\eta'_1\ldots\eta'_w\rangle$, is characterized by the condition that a simultaneous measurement of $\eta_1, \ldots, \eta_w$ is certain to lead to the results $\eta'_1, \ldots, \eta'_w$. Thus $|\langle\xi'_1\ldots\xi'_u|\eta'_1\ldots\eta'_w\rangle|^2$ is the probability of the $\xi$'s having the values $\xi'_1\ldots\xi'_u$ for the state for which the $\eta$'s certainly have the values $\eta'_1\ldots\eta'_w$. Since

$$|\langle\xi'_1\ldots\xi'_u|\eta'_1\ldots\eta'_w\rangle|^2 = |\langle\eta'_1\ldots\eta'_w|\xi'_1\ldots\xi'_u\rangle|^2,$$

we have the theorem of reciprocity: the probability of the $\xi$'s having the values $\xi'$ for the state for which the $\eta$'s certainly have the values $\eta'$ is equal to the probability of the $\eta$'s having the values $\eta'$ for the state for which the $\xi$'s certainly have the values $\xi'$.

If  all  the  fa  have  discrete  eigenvalues  and  some  of  the  fa  have 
continuous  eigenvalues,  still  gives  the  probability 

distribution  of  values  for  the  fa  for  the  state  for  which  the  fa  cer- 
tainly have  the  values  f.  If  some  of  the  fs  have  continuous  eigen- 
va  ues  \v  Vwy  is  not  normalized  and  then  gives 

only  the  relative  probability  distribution  of  values  for  the  fa  for  the 
state  for  which  the  fs  certainly  have  the  values  f. 
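The theorem of reciprocity can be checked numerically in the simplest all-discrete case. The sketch below is only an illustration under stated assumptions: the 2x2 unitary matrix `T`, the parameters `theta` and `alpha`, and all function names are hypothetical choices of ours, standing in for a transformation function between two observables with two eigenvalues each.

```python
import cmath
import math

# Hypothetical 2x2 transformation function: T[i][j] plays the role of the
# bracket (xi_i | eta_j). theta and alpha are arbitrary illustrative values.
theta, alpha = 0.3, 1.1
c, s = math.cos(theta), math.sin(theta)
phase = cmath.exp(1j * alpha)
T = [[c, -s * phase],
     [s, c * phase]]

# The other transformation function (eta_j | xi_i) is the conjugate complex.
T_conj = [[T[i][j].conjugate() for i in range(2)] for j in range(2)]


def prob_xi_given_eta(i, j):
    """Probability that xi has its i-th value when eta certainly has its j-th."""
    return abs(T[i][j]) ** 2


def prob_eta_given_xi(j, i):
    """Probability that eta has its j-th value when xi certainly has its i-th."""
    return abs(T_conj[j][i]) ** 2


def condition_56(j, k):
    """Discrete form of (56): sum over xi of (eta_j|xi)(xi|eta_k)."""
    return sum(T_conj[j][i] * T[i][k] for i in range(2))


for i in range(2):
    for j in range(2):
        print(i, j, prob_xi_given_eta(i, j), prob_eta_given_xi(j, i))
```

The two printed probabilities agree in every row, and `condition_56(j, k)` reduces to the Kronecker delta, which is the discrete form of condition (56).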

19.  Theorems  about  functions  of  observables 

We  shall  illustrate  the  mathematical  value  of  representations  by 
using  them  to  prove  some  theorems. 

Theorem 1. A linear operator that commutes with an observable $\xi$
commutes also with any function of $\xi$.

The theorem is obviously true when the function is expressible as
a power series. To prove it generally, let $\omega$ be the linear operator,
so that we have the equation

$$\xi\omega - \omega\xi = 0. \tag{57}$$

Let us introduce a representation in which $\xi$ is diagonal. If $\xi$ by
itself does not form a complete commuting set of observables, we must
make it into a complete commuting set by adding certain observables,
$\beta$ say, to it, and then take the representation in which $\xi$ and the $\beta$'s
are diagonal. (The case when $\xi$ does form a complete commuting set
by itself can be looked upon as a special case of the preceding one
with the number of $\beta$ variables zero.) In this representation equation
(57) becomes

$$\langle\xi'\beta'|\xi\omega - \omega\xi|\xi''\beta''\rangle = 0,$$

which reduces to

$$\xi'\langle\xi'\beta'|\omega|\xi''\beta''\rangle - \langle\xi'\beta'|\omega|\xi''\beta''\rangle\xi'' = 0.$$

In the case when the eigenvalues of $\xi$ are discrete, this equation
shows that all the matrix elements $\langle\xi'\beta'|\omega|\xi''\beta''\rangle$ of $\omega$ vanish except
those for which $\xi' = \xi''$. In the case when the eigenvalues of $\xi$ are
continuous it shows, like equation (48), that $\langle\xi'\beta'|\omega|\xi''\beta''\rangle$ is of the
form

$$\langle\xi'\beta'|\omega|\xi''\beta''\rangle = c\,\delta(\xi'-\xi''),$$

where $c$ is some function of $\xi'$ and the $\beta'$'s and $\beta''$'s. In either case
we may say that the matrix representing $\omega$ is diagonal with respect
to $\xi$. If $f(\xi)$ denotes any function of $\xi$ in accordance with the general
theory of § 11, which requires $f(\xi''')$ to be defined for $\xi'''$ any eigenvalue
of $\xi$, we can deduce in either case

$$f(\xi')\langle\xi'\beta'|\omega|\xi''\beta''\rangle - \langle\xi'\beta'|\omega|\xi''\beta''\rangle f(\xi'') = 0.$$

This gives

$$\langle\xi'\beta'|f(\xi)\omega - \omega f(\xi)|\xi''\beta''\rangle = 0,$$

so that

$$f(\xi)\omega - \omega f(\xi) = 0$$

and the theorem is proved.
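Theorem 1 can be illustrated with small matrices. The following is a minimal sketch, assuming a hypothetical observable `xi` with a doubly degenerate eigenvalue (so that an extra label is needed, as with the additional observables in the proof) and a function that is defined only through its values on the eigenvalues. All names and numbers are ours, chosen for illustration.

```python
import math


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def commutator(a, b):
    """ab - ba for square matrices."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]


def max_abs(m):
    """Largest absolute value among the matrix elements."""
    return max(abs(x) for row in m for x in row)


def diag(values):
    """Diagonal matrix with the given diagonal entries."""
    n = len(values)
    return [[values[i] if i == j else 0.0 for j in range(n)] for i in range(n)]


# xi is diagonal with eigenvalues 1, 1, 2 (the value 1 is degenerate).
xi_eigenvalues = [1.0, 1.0, 2.0]
xi = diag(xi_eigenvalues)

# omega is diagonal with respect to xi: it only connects equal eigenvalues.
omega = [[0.5, 2.0, 0.0],
         [-1.0, 3.0, 0.0],
         [0.0, 0.0, 7.0]]

# omega_bad connects the eigenvalues 1 and 2, so it cannot commute with xi.
omega_bad = [[0.5, 2.0, 1.0],
             [-1.0, 3.0, 0.0],
             [0.0, 0.0, 7.0]]

# f(xi) is defined by applying f to each eigenvalue (here f = sqrt).
f_xi = diag([math.sqrt(x) for x in xi_eigenvalues])

print(max_abs(commutator(xi, omega)))      # omega commutes with xi
print(max_abs(commutator(f_xi, omega)))    # and hence with f(xi)
print(max_abs(commutator(xi, omega_bad)))  # nonzero
```

The block structure of `omega` is exactly the statement that its matrix is diagonal with respect to `xi`, which is the step the proof relies on.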

As a special case of the theorem, we have the result that any
observable that commutes with an observable $\xi$ also commutes with
any function of $\xi$. This result appears as a physical necessity when
we identify, as in § 13, the condition of commutability of two
observables with the condition of compatibility of the corresponding
observations. Any observation that is compatible with the
measurement of an observable $\xi$ must also be compatible with the
measurement of $f(\xi)$, since any measurement of $\xi$ includes in itself
a measurement of $f(\xi)$.

Theorem 2. A linear operator that commutes with each of a complete
set of commuting observables is a function of those observables.

Let $\omega$ be the linear operator and $\xi_1, \xi_2, \ldots, \xi_u$ the complete set of
commuting observables, and set up a representation with these
observables diagonal. Since $\omega$ commutes with each of the $\xi$'s, the
matrix representing it is diagonal with respect to each of the $\xi$'s,
by the argument we had above. This matrix is therefore a diagonal
matrix and is of the form (49), involving a number $c'$ which is a
function of the $\xi'$'s. It thus represents the function of the $\xi$'s that
$c'$ is of the $\xi'$'s, and hence $\omega$ equals this function of the $\xi$'s.

Theorem 3. If an observable $\xi$ and a linear operator $g$ are such that
any linear operator that commutes with $\xi$ also commutes with $g$, then $g$
is a function of $\xi$.

This is the converse of Theorem 1. To prove it, we use the same
representation with $\xi$ diagonal as we had for Theorem 1. In the first
place, we see that $g$ must commute with $\xi$ itself, and hence the
representative of $g$ must be diagonal with respect to $\xi$, i.e. it must
be of the form

$$\langle\xi'\beta'|g|\xi''\beta''\rangle = a(\xi'\beta'\beta'')\,\delta_{\xi'\xi''} \quad\text{or}\quad a(\xi'\beta'\beta'')\,\delta(\xi'-\xi''),$$

according to whether $\xi$ has discrete or continuous eigenvalues. Now
let $\omega$ be any linear operator that commutes with $\xi$, so that its
representative is of the form

$$\langle\xi'\beta'|\omega|\xi''\beta''\rangle = b(\xi'\beta'\beta'')\,\delta_{\xi'\xi''} \quad\text{or}\quad b(\xi'\beta'\beta'')\,\delta(\xi'-\xi'').$$

By hypothesis $\omega$ must also commute with $g$, so that

$$\langle\xi'\beta'|g\omega - \omega g|\xi''\beta''\rangle = 0. \tag{58}$$

If we suppose for definiteness that the $\beta$'s have discrete eigenvalues,
(58) leads, with the help of the law of matrix multiplication, to

$$\sum_{\beta'''}\{a(\xi'\beta'\beta''')\,b(\xi'\beta'''\beta'') - b(\xi'\beta'\beta''')\,a(\xi'\beta'''\beta'')\} = 0, \tag{59}$$

the left-hand side of (58) being equal to the left-hand side of (59)
multiplied by $\delta_{\xi'\xi''}$ or $\delta(\xi'-\xi'')$. Equation (59) must hold for all
functions $b(\xi'\beta'\beta'')$. We can deduce that

$$a(\xi'\beta'\beta'') = 0 \quad\text{for}\quad \beta' \neq \beta'',$$

$$a(\xi'\beta'\beta') = a(\xi'\beta''\beta'').$$

The first of these results shows that the matrix representing $g$ is
diagonal and the second shows that $a(\xi'\beta'\beta')$ is a function of $\xi'$ only.
We can now infer that $g$ is that function of $\xi$ which $a(\xi'\beta'\beta')$ is of $\xi'$,
so the theorem is proved. The proof is analogous if some of the $\beta$'s
have continuous eigenvalues.

Theorems 1 and 3 are still valid if we replace the observable $\xi$ by
any set of commuting observables $\xi_1, \xi_2, \ldots, \xi_r$, only formal changes
being needed in the proofs.


20.  Developments  in  notation 

The theory of representations that we have developed provides a
general system for labelling kets and bras. In a representation in which
the complete set of commuting observables $\xi_1,\ldots,\xi_u$ are diagonal any
ket $|P\rangle$ will have a representative $\langle\xi'_1\ldots\xi'_u|P\rangle$, or $\langle\xi'|P\rangle$ for brevity.
This representative is a definite function of the variables $\xi'$, say $\psi(\xi')$.
The function $\psi$ then determines the ket $|P\rangle$ completely, so it may be
used to label this ket, to replace the arbitrary label $P$. In symbols,

$$\text{if}\quad \langle\xi'|P\rangle = \psi(\xi') \qquad\text{we put}\quad |P\rangle = |\psi(\xi)\rangle. \tag{60}$$

We must put $|P\rangle$ equal to $|\psi(\xi)\rangle$ and not $|\psi(\xi')\rangle$, since it does not
depend on a particular set of eigenvalues for the $\xi$'s, but only on the
form of the function $\psi$.

With $f(\xi)$ any function of the observables $\xi_1,\ldots,\xi_u$, $f(\xi)|P\rangle$ will
have as its representative

$$\langle\xi'|f(\xi)|P\rangle = f(\xi')\psi(\xi').$$

Thus according to (60) we put

$$f(\xi)|P\rangle = |f(\xi)\psi(\xi)\rangle.$$

With the help of the second of equations (60) we now get

$$f(\xi)|\psi(\xi)\rangle = |f(\xi)\psi(\xi)\rangle. \tag{61}$$

This is a general result holding for any functions $f$ and $\psi$ of the $\xi$'s,
and it shows that the vertical line $|$ is not necessary with the new
notation for a ket: either side of (61) may be written simply as
$f(\xi)\psi(\xi)\rangle$. Thus the rule for the new notation becomes:

$$\text{if}\quad \langle\xi'|P\rangle = \psi(\xi') \qquad\text{we put}\quad |P\rangle = \psi(\xi)\rangle. \tag{62}$$

We may further shorten $\psi(\xi)\rangle$ to $\psi\rangle$, leaving the variables $\xi$
understood, if no ambiguity arises thereby.

The ket $\psi(\xi)\rangle$ may be considered as the product of the linear
operator $\psi(\xi)$ with a ket which is denoted simply by $\rangle$ without a
label. We call the ket $\rangle$ the standard ket. Any ket whatever can be
expressed as a function of the $\xi$'s multiplied into the standard ket.
For example, taking $|P\rangle$ in (62) to be the basic ket $|\xi''\rangle$, we find

$$|\xi''\rangle = \delta_{\xi_1\xi''_1}\ldots\delta_{\xi_v\xi''_v}\,\delta(\xi_{v+1}-\xi''_{v+1})\ldots\delta(\xi_u-\xi''_u)\rangle \tag{63}$$

in the case when $\xi_1,\ldots,\xi_v$ have discrete eigenvalues and $\xi_{v+1},\ldots,\xi_u$ have
continuous eigenvalues. The standard ket is characterized by the
condition that its representative $\langle\xi'|\rangle$ is unity over the whole domain
of the variable $\xi'$, as may be seen by putting $\psi = 1$ in (62).

A further contraction may be made in the notation, namely to
leave the symbol $\rangle$ for the standard ket understood. A ket is then
written simply as $\psi(\xi)$, a function of the observables $\xi$. A function
of the $\xi$'s used in this way to denote a ket is called a wave function.
The system of notation provided by wave functions is the one usually
used by most authors for calculations in quantum mechanics. In
using it one should remember that each wave function is understood
to have the standard ket multiplied into it on the right, which
prevents one from multiplying the wave function by any operator
on the right. Wave functions can be multiplied by operators only on
the left. This distinguishes them from ordinary functions of the $\xi$'s,
which are operators and can be multiplied by operators on either the
left or the right. A wave function is just the representative of a ket
expressed as a function of the observables $\xi$, instead of eigenvalues $\xi'$
for those observables. The square of its modulus gives the probability
(or the relative probability, if it is not normalized) of the $\xi$'s
having specified values, or lying in specified small ranges, for the
corresponding state.

The new notation for bras may be developed in the same way as
for kets. A bra $\langle Q|$ whose representative $\langle Q|\xi'\rangle$ is $\phi(\xi')$ we write
$\langle\phi(\xi)|$. With this notation the conjugate imaginary to $|\psi(\xi)\rangle$ is
$\langle\bar\psi(\xi)|$. Thus the rule that we have used hitherto, that a ket and
its conjugate imaginary bra are both specified by the same label,
must be extended to read: if the labels of a ket involve complex
numbers or complex functions, the labels of the conjugate imaginary
bra involve the conjugate complex numbers or functions. As in the
case of kets we can show that $\langle\phi(\xi)|f(\xi)$ and $\langle\phi(\xi)f(\xi)|$ are the same,
so that the vertical line can be omitted. We can consider $\langle\phi(\xi)$ as
the product of the linear operator $\phi(\xi)$ into the standard bra $\langle$, which
is the conjugate imaginary of the standard ket $\rangle$. We may leave
the standard bra understood, so that a general bra is written as
$\phi(\xi)$, the conjugate complex of a wave function. The conjugate complex
of a wave function can be multiplied by any linear operator on the
right, but cannot be multiplied by a linear operator on the left. We
can construct triple products of the form $\langle f(\xi)\rangle$. Such a triple product
is a number, equal to $f(\xi)$ summed or integrated over the whole
domain of eigenvalues for the $\xi$'s,

$$\langle f(\xi)\rangle = \sum_{\xi'_1\ldots\xi'_v}\int\!\cdots\!\int f(\xi')\,d\xi'_{v+1}\ldots d\xi'_u \tag{64}$$

in the case when $\xi_1,\ldots,\xi_v$ have discrete eigenvalues and $\xi_{v+1},\ldots,\xi_u$ have
continuous eigenvalues.

The standard ket and bra are defined with respect to a representation.
If we carried through the above work with a different representation
in which the complete set of commuting observables $\eta$ are
diagonal, or if we merely changed the phase factors in the representation
with the $\xi$'s diagonal, we should get a different standard ket and
bra. In a piece of work in which more than one standard ket or bra
appears one must, of course, distinguish them by giving them labels.

A further development of the notation which is of great importance
for dealing with complicated dynamical systems will now be discussed.
Suppose we have a dynamical system describable in terms of dynamical
variables which can all be divided into two sets, set A and set B
say, such that any member of set A commutes with any member of
set B. A general dynamical variable must be expressible as a function
of the A-variables and B-variables together. We may consider
another dynamical system in which the dynamical variables are the
A-variables only; let us call it the A-system. Similarly we may
consider a third dynamical system in which the dynamical variables
are the B-variables only, the B-system. The original system can
then be looked upon as a combination of the A-system and the
B-system in accordance with the mathematical scheme given below.

Let us take any ket $|a\rangle$ for the A-system and any ket $|b\rangle$ for the
B-system. We assume that they have a product $|a\rangle|b\rangle$ for which
the commutative and distributive axioms of multiplication hold, i.e.

$$|a\rangle|b\rangle = |b\rangle|a\rangle,$$

$$\{c_1|a_1\rangle + c_2|a_2\rangle\}|b\rangle = c_1|a_1\rangle|b\rangle + c_2|a_2\rangle|b\rangle,$$

$$|a\rangle\{c_1|b_1\rangle + c_2|b_2\rangle\} = c_1|a\rangle|b_1\rangle + c_2|a\rangle|b_2\rangle,$$

the $c$'s being numbers. We can give a meaning to any A-variable
operating on the product $|a\rangle|b\rangle$ by assuming that it operates only
on the $|a\rangle$ factor and commutes with the $|b\rangle$ factor, and similarly
we can give a meaning to any B-variable operating on this product
by assuming that it operates only on the $|b\rangle$ factor and commutes
with the $|a\rangle$ factor. (This makes every A-variable commute with
every B-variable.) Thus any dynamical variable of the original
system can operate on the product $|a\rangle|b\rangle$, so this product can be
looked upon as a ket for the original system, and may then be
written $|ab\rangle$, the two labels $a$ and $b$ being sufficient to specify it.
In this way we get the fundamental equations

$$|a\rangle|b\rangle = |b\rangle|a\rangle = |ab\rangle. \tag{65}$$

The multiplication here is of quite a different kind from any that
occurs earlier in the theory. The ket vectors $|a\rangle$ and $|b\rangle$ are in two
different vector spaces and their product is in a third vector space,
which may be called the product of the two previous vector spaces.
The number of dimensions of the product space is equal to the
product of the number of dimensions of each of the factor spaces.
A general ket vector of the product space is not of the form (65), but
is a sum or integral of kets of this form.
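For finite-dimensional factor spaces the product of kets can be sketched with plain lists. This is only an illustration under assumptions of ours: the component ordering (index pair `(i, j)` stored at position `i * dim_b + j`) is a bookkeeping convention, and the names `ket_product`, `lift_a`, `lift_b` and the sample numbers are hypothetical.

```python
def ket_product(a, b):
    """Components of |a>|b>, with the pair (i, j) stored at i * len(b) + j."""
    return [x * y for x in a for y in b]


def lift_a(op, dim_b):
    """An A-variable on the product space: acts on |a>, leaves |b> alone."""
    dim_a = len(op)
    n = dim_a * dim_b
    out = [[0] * n for _ in range(n)]
    for i in range(dim_a):
        for k in range(dim_a):
            for j in range(dim_b):
                out[i * dim_b + j][k * dim_b + j] = op[i][k]
    return out


def lift_b(op, dim_a):
    """A B-variable on the product space: acts on |b>, leaves |a> alone."""
    dim_b = len(op)
    n = dim_a * dim_b
    out = [[0] * n for _ in range(n)]
    for i in range(dim_a):
        for j in range(dim_b):
            for l in range(dim_b):
                out[i * dim_b + j][i * dim_b + l] = op[j][l]
    return out


def apply(op, ket):
    """Matrix acting on a column of components."""
    return [sum(row[k] * ket[k] for k in range(len(ket))) for row in op]


def matmul(x, y):
    """Product of two square matrices given as nested lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


a = [1, 2]                                   # ket of a 2-dimensional A-system
b = [3, 0, -1]                               # ket of a 3-dimensional B-system
op_a = [[0, 1], [1, 0]]                      # some A-variable
op_b = [[1, 0, 0], [0, 2, 1], [0, 1, 3]]     # some B-variable

ab = ket_product(a, b)
big_a = lift_a(op_a, len(b))
big_b = lift_b(op_b, len(a))

print(len(ab))                               # dimension of the product space
print(apply(big_a, ab) == ket_product(apply(op_a, a), b))
print(matmul(big_a, big_b) == matmul(big_b, big_a))
```

The dimension of the product space comes out as the product of the factor dimensions, the lifted A-variable acts only on the first factor, and the two lifted operators commute, as the scheme requires.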

Let us take a representation for the A-system in which a complete
set of commuting observables $\xi_A$ of the A-system are diagonal. We
shall then have the basic bras $\langle\xi'_A|$ for the A-system. Similarly, taking
a representation for the B-system with the observables $\xi_B$ diagonal,
we shall have the basic bras $\langle\xi'_B|$ for the B-system. The products

$$\langle\xi'_A|\langle\xi'_B| = \langle\xi'_A\xi'_B| \tag{66}$$

will then provide the basic bras for a representation for the original
system, in which representation the $\xi_A$'s and the $\xi_B$'s will be diagonal.
The $\xi_A$'s and $\xi_B$'s will together form a complete set of commuting
observables for the original system. From (65) and (66) we get

$$\langle\xi'_A|a\rangle\langle\xi'_B|b\rangle = \langle\xi'_A\xi'_B|ab\rangle, \tag{67}$$

showing that the representative of $|ab\rangle$ equals the product of the
representatives of $|a\rangle$ and of $|b\rangle$ in their respective representations.

We can introduce the standard ket, $\rangle_A$ say, for the A-system,
with respect to the representation with the $\xi_A$'s diagonal, and also
the standard ket $\rangle_B$ for the B-system, with respect to the representation
with the $\xi_B$'s diagonal. Their product $\rangle_A\rangle_B$ is then the
standard ket for the original system, with respect to the representation
with the $\xi_A$'s and $\xi_B$'s diagonal. Any ket for the original system
may be expressed as

$$\psi(\xi_A, \xi_B)\rangle_A\rangle_B. \tag{68}$$

It may be that in a certain calculation we wish to use a particular
representation for the B-system, say the above representation with
the $\xi_B$'s diagonal, but do not wish to introduce any particular
representation for the A-system. It would then be convenient to
use the standard ket $\rangle_B$ for the B-system and no standard ket for
the A-system. Under these circumstances we could write any ket
for the original system as

$$|\xi_B\rangle\rangle_B, \tag{69}$$

in which $|\xi_B\rangle$ is a ket for the A-system and is also a function of the
$\xi_B$'s, i.e. it is a ket for the A-system for each set of values for the
$\xi_B$'s; in fact (69) equals (68) if we take

$$|\xi_B\rangle = \psi(\xi_A, \xi_B)\rangle_A.$$

We may leave the standard ket $\rangle_B$ in (69) understood, and then we
have the general ket for the original system appearing as $|\xi_B\rangle$, a ket
for the A-system and a wave function in the variables $\xi_B$ of the
B-system. An example of this notation will be used in § 66.

The above work can be immediately extended to a dynamical
system describable in terms of dynamical variables which can be
divided into three or more sets A, B, C, ... such that any member of
one set commutes with any member of another. Equation (65) gets
generalized to

$$|a\rangle|b\rangle|c\rangle\ldots = |abc\ldots\rangle,$$

the factors on the left being kets for the component systems and
the ket on the right being a ket for the original system. Equations
(66), (67), and (68) get generalized to many factors in a similar way.

THE QUANTUM CONDITIONS

21.  Poisson  brackets 

Our work so far has consisted in setting up a general mathematical
scheme connecting states and observables in quantum mechanics.
One of the dominant features of this scheme is that observables, and
dynamical variables in general, appear in it as quantities which do
not obey the commutative law of multiplication. It now becomes
necessary for us to obtain equations to replace the commutative law
of multiplication, equations that will tell us the value of $\xi\eta - \eta\xi$ when
$\xi$ and $\eta$ are any two observables or dynamical variables. Only when
such equations are known shall we have a complete scheme of
mechanics with which to replace classical mechanics. These new
equations are called quantum conditions or commutation relations.

The  problem  of  finding  quantum  conditions  is  not  of  such  a general 
character  as  those  we  have  been  concerned  with  up  to  the  present.  It 
is  instead  a special  problem  which  presents  itself  with  each  particular 
dynamical  system  one  is  called  upon  to  study.  There  is,  however, 
a fairly  general  method  of  obtaining  quantum  conditions,  applicable 
to  a very  large  class  of  dynamical  systems.  This  is  the  method  of 
classical  analogy  and  will  form  the  main  theme  of  the  present  chapter. 
Those  dynamical  systems  to  which  this  method  is  not  applicable 
must  be  treated  individually  and  special  considerations  used  in  each 
case. 

The value of classical analogy in the development of quantum
mechanics depends on the fact that classical mechanics provides a
valid description of dynamical systems under certain conditions,
when the particles and bodies composing the systems are sufficiently
massive for the disturbance accompanying an observation to be
negligible. Classical mechanics must therefore be a limiting case of
quantum mechanics. We should thus expect to find that important
concepts in classical mechanics correspond to important concepts in
quantum mechanics, and, from an understanding of the general
nature of the analogy between classical and quantum mechanics, we
may hope to get laws and theorems in quantum mechanics appearing
as simple generalizations of well-known results in classical mechanics;
in particular we may hope to get the quantum conditions appearing
as a simple generalization of the classical law that all dynamical
variables commute.

Let us take a dynamical system composed of a number of particles
in interaction. As independent dynamical variables for dealing with
the system we may use the Cartesian coordinates of all the particles
and the corresponding Cartesian components of velocity of the
particles. It is, however, more convenient to work with the momentum
components instead of the velocity components. Let us call the
coordinates $q_r$, $r$ going from 1 to three times the number of particles,
and the corresponding momentum components $p_r$. The $q$'s and $p$'s
are called canonical coordinates and momenta.

The method of Lagrange's equations of motion involves introducing
coordinates $q_r$ and momenta $p_r$ in a more general way, applicable
also for a system not composed of particles (e.g. a system containing
rigid bodies). These more general $q$'s and $p$'s are also called canonical
coordinates and momenta. Any dynamical variable is expressible in
terms of a set of canonical coordinates and momenta.

An important concept in general dynamical theory is the Poisson
Bracket. Any two dynamical variables $u$ and $v$ have a P.B. (Poisson
Bracket) which we shall denote by $[u, v]$, defined by

$$[u, v] = \sum_r\left\{\frac{\partial u}{\partial q_r}\frac{\partial v}{\partial p_r} - \frac{\partial u}{\partial p_r}\frac{\partial v}{\partial q_r}\right\}, \tag{1}$$

$u$ and $v$ being regarded as functions of a set of canonical coordinates
and momenta $q_r$ and $p_r$ for the purpose of the differentiations. The
right-hand side of (1) is independent of which set of canonical
coordinates and momenta are used, this being a consequence of the
general definition of canonical coordinates and momenta, so the
P.B. $[u, v]$ is well defined.

The main properties of P.B.s, which follow at once from their
definition (1), are

$$[u, v] = -[v, u], \tag{2}$$

$$[u, c] = 0, \tag{3}$$

where $c$ is a number (which may be considered as a special case of a
dynamical variable),

$$[u_1 + u_2, v] = [u_1, v] + [u_2, v], \qquad [u, v_1 + v_2] = [u, v_1] + [u, v_2], \tag{4}$$

$$[u_1u_2, v] = [u_1, v]u_2 + u_1[u_2, v], \qquad [u, v_1v_2] = [u, v_1]v_2 + v_1[u, v_2]. \tag{5}$$

Also the identity

$$[u, [v, w]] + [v, [w, u]] + [w, [u, v]] = 0 \tag{6}$$

is easily verified. Equations (4) express that the P.B. $[u, v]$ involves
$u$ and $v$ linearly, while equations (5) correspond to the ordinary rules
for differentiating a product.
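The classical definition and its properties can be tried out numerically for one degree of freedom. The sketch below is an illustration only: it estimates the partial derivatives in the classical P.B. by central differences, and the step `h`, the sample functions and the evaluation point are arbitrary choices of ours.

```python
def poisson_bracket(u, v, q, p, h=1e-5):
    """Central-difference estimate of du/dq dv/dp - du/dp dv/dq at (q, p)."""
    du_dq = (u(q + h, p) - u(q - h, p)) / (2 * h)
    du_dp = (u(q, p + h) - u(q, p - h)) / (2 * h)
    dv_dq = (v(q + h, p) - v(q - h, p)) / (2 * h)
    dv_dp = (v(q, p + h) - v(q, p - h)) / (2 * h)
    return du_dq * dv_dp - du_dp * dv_dq


# Sample dynamical variables as functions of (q, p).
def coord(q, p):
    return q


def momentum(q, p):
    return p


def u1(q, p):
    return q * q


def u2(q, p):
    return p


def v(q, p):
    return q * p + p * p


def u1u2(q, p):
    return u1(q, p) * u2(q, p)


q0, p0 = 0.7, -1.3

# [q, p] = 1
qp_bracket = poisson_bracket(coord, momentum, q0, p0)

# Antisymmetry: [u, v] = -[v, u]
antisym_gap = poisson_bracket(u1, v, q0, p0) + poisson_bracket(v, u1, q0, p0)

# Product rule: [u1 u2, v] = [u1, v] u2 + u1 [u2, v]
lhs = poisson_bracket(u1u2, v, q0, p0)
rhs = (poisson_bracket(u1, v, q0, p0) * u2(q0, p0)
       + u1(q0, p0) * poisson_bracket(u2, v, q0, p0))

print(qp_bracket, antisym_gap, lhs, rhs)
```

In the classical theory the order of factors on the right-hand side of the product rule is immaterial, since all the quantities are ordinary numbers at a given point of phase space.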

Let us try to introduce a quantum P.B. which shall be the analogue
of the classical one. We assume the quantum P.B. to satisfy all the
conditions (2) to (6), it being now necessary that the order of the
factors $u_1$ and $u_2$ in the first of equations (5) should be preserved
throughout the equation, as in the way we have here written it, and
similarly for the $v_1$ and $v_2$ in the second of equations (5). These
conditions are already sufficient to determine the form of the quantum
P.B. uniquely, as may be seen from the following argument. We can
evaluate the P.B. $[u_1u_2, v_1v_2]$ in two different ways, since we can use
either of the two formulas (5) first, thus,

$$[u_1u_2, v_1v_2] = [u_1, v_1v_2]u_2 + u_1[u_2, v_1v_2]$$
$$= \{[u_1, v_1]v_2 + v_1[u_1, v_2]\}u_2 + u_1\{[u_2, v_1]v_2 + v_1[u_2, v_2]\}$$
$$= [u_1, v_1]v_2u_2 + v_1[u_1, v_2]u_2 + u_1[u_2, v_1]v_2 + u_1v_1[u_2, v_2]$$

and

$$[u_1u_2, v_1v_2] = [u_1u_2, v_1]v_2 + v_1[u_1u_2, v_2]$$
$$= [u_1, v_1]u_2v_2 + u_1[u_2, v_1]v_2 + v_1[u_1, v_2]u_2 + v_1u_1[u_2, v_2].$$

Equating these two results, we obtain

$$[u_1, v_1](u_2v_2 - v_2u_2) = (u_1v_1 - v_1u_1)[u_2, v_2].$$

Since this condition holds with $u_1$ and $v_1$ quite independent of $u_2$ and
$v_2$, we must have

$$u_1v_1 - v_1u_1 = i\hbar[u_1, v_1],$$

$$u_2v_2 - v_2u_2 = i\hbar[u_2, v_2],$$

where $\hbar$ must not depend on $u_1$ and $v_1$, nor on $u_2$ and $v_2$, and also
must commute with $(u_1v_1 - v_1u_1)$. It follows that $\hbar$ must be simply
a number. We want the P.B. of two real variables to be real, as in
the classical theory, which requires, from the work at the top of p. 28,
that $\hbar$ shall be a real number when introduced, as here, with the
coefficient $i$. We are thus led to the following definition for the
quantum P.B. $[u, v]$ of any two variables $u$ and $v$,

$$uv - vu = i\hbar[u, v], \tag{7}$$

in which $\hbar$ is a new universal constant. It has the dimensions of
action. In order that the theory may agree with experiment, we
must take $\hbar$ equal to $h/2\pi$, where $h$ is the universal constant that
was introduced by Planck, known as Planck's constant. It is easily
verified that the quantum P.B. satisfies all the conditions (2), (3), (4),
(5), and (6).

The problem of finding quantum conditions now reduces to the
problem of determining P.B.s in quantum mechanics. The strong
analogy between the quantum P.B. defined by (7) and the classical
P.B. defined by (1) leads us to make the assumption that the quantum
P.B.s, or at any rate the simpler ones of them, have the same values
as the corresponding classical P.B.s. The simplest P.B.s are those
involving the canonical coordinates and momenta themselves and
have the following values in the classical theory:

$$[q_r, q_s] = 0, \qquad [p_r, p_s] = 0, \qquad [q_r, p_s] = \delta_{rs}. \tag{8}$$

We therefore assume that the corresponding quantum P.B.s also
have the values given by (8). By eliminating the quantum P.B.s
with the help of (7), we obtain the equations

$$q_rq_s - q_sq_r = 0, \qquad p_rp_s - p_sp_r = 0, \qquad q_rp_s - p_sq_r = i\hbar\,\delta_{rs}, \tag{9}$$

which are the fundamental quantum conditions. They show us where
the lack of commutability among the canonical coordinates and
momenta lies. They also provide us with a basis for calculating
commutation relations between other dynamical variables. For instance,
if $\xi$ and $\eta$ are any two functions of the $q$'s and $p$'s expressible as
[...]
value of $\xi\eta - \eta\xi$, as will become clear from the following work.
Equations (9) thus give the solution of the problem of finding the
quantum conditions, for all those dynamical systems which have a
classical analogue and which are describable in terms of canonical
coordinates and momenta. This does not include all possible systems
in quantum mechanics.
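As a small worked illustration of how the fundamental quantum conditions fix other commutation relations (an example of ours, not taken from the text), consider one degree of freedom with $qp - pq = i\hbar$ and evaluate the commutator of $q^2$ with $p$, keeping the order of factors throughout:

```latex
\begin{align*}
q^2 p - p q^2 &= q(qp - pq) + (qp - pq)q \\
              &= q\,(i\hbar) + (i\hbar)\,q \\
              &= 2i\hbar\, q .
\end{align*}
% By the definition uv - vu = i\hbar [u, v], the quantum P.B. is [q^2, p] = 2q,
% which equals the classical value (d q^2/dq)(dp/dp) = 2q.
```

Here the quantum result coincides with the classical one; in less simple cases the two can differ through the order of factors in a product.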

Equations (7) and (9) provide the foundation for the analogy
between quantum mechanics and classical mechanics. They show
that classical mechanics may be regarded as the limiting case of quantum
mechanics when $\hbar$ tends to zero. A P.B. in quantum mechanics is a
purely algebraic notion and is thus a rather more fundamental concept
than a classical P.B., which can be defined only with reference to
a set of canonical coordinates and momenta. For this reason canonical
coordinates and momenta are of less importance in quantum mechanics
than in classical mechanics; in fact, we may have a system in quantum
mechanics for which canonical coordinates and momenta do
not exist and we can still give a meaning to P.B.s. Such a system
would be one without a classical analogue and we should not be able
to obtain its quantum conditions by the method here described.
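The purely algebraic character of the quantum P.B. can be seen with any non-commuting quantities, without reference to canonical coordinates. The sketch below uses small integer matrices as hypothetical stand-ins for dynamical variables and checks the product rule and the Jacobi identity for the commutator $uv - vu$, which differs from the quantum P.B. only by the numerical factor $i\hbar$. All names and matrices are illustrative assumptions.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(a, b):
    """Element-wise sum of two matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def sub(a, b):
    """Element-wise difference of two matrices."""
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def comm(a, b):
    """Commutator ab - ba, i.e. i*hbar times the quantum P.B. [a, b]."""
    return sub(matmul(a, b), matmul(b, a))


# Arbitrary non-commuting 2x2 integer matrices.
u1 = [[1, 2], [0, 1]]
u2 = [[0, 1], [1, 1]]
v = [[2, 0], [1, 3]]
w = [[1, 1], [4, 0]]

# Product rule with the order of u1 and u2 preserved.
lhs = comm(matmul(u1, u2), v)
rhs = add(matmul(comm(u1, v), u2), matmul(u1, comm(u2, v)))

# The same expression with the factors in the wrong order.
rhs_wrong_order = add(matmul(u2, comm(u1, v)), matmul(comm(u2, v), u1))

# Jacobi identity.
jacobi = add(add(comm(u1, comm(v, w)), comm(v, comm(w, u1))),
             comm(w, comm(u1, v)))

print(lhs == rhs)
print(lhs == rhs_wrong_order)
print(jacobi)
```

The product rule holds only when the order of the factors is preserved, which is the point of the remark that the order of $u_1$ and $u_2$ must be kept throughout the equation.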

From  equations  (9)  we  see  that  two  variables  with  different  suffixes 
r and  s always  commute.  It  follows  that  any  function  of  qr  and  pr 
will  commute  with  any  function  of  qs  and  ps  when  s differs  from  r. 
Different  values  of  r correspond  to  different  degrees  of  freedom  of  the 
dynamical  system,  so  we  get  the  result  that  dynamical  variables 
referring  to  different  degrees  of  freedom  commute.  This  law,  as  we  have 
derived  it  from  (9),  is  proved  only  for  dynamical  systems  with 
classical  analogues,  but  we  assume  it  to  hold  generally.  In  this  way 
we  can  make  a start  on  the  problem  of  finding  quantum  conditions 
for  dynamical  systems  for  which  canonical  coordinates  and  momenta 
do  not  exist,  provided  we  can  give  a meaning  to  different  degrees  of 
freedom, as we may be able to do with the help of physical insight.

We can now see the physical meaning of the division, which was
discussed in the preceding section, of the dynamical variables into
sets, any member of one set commuting with any member of another.
Each set corresponds to certain degrees of freedom, or possibly just
one degree of freedom. The division may correspond to the physical
process of resolving the dynamical system into its constituent parts,
each constituent being capable of existing by itself as a physical
system, and the various constituents having to be brought into
interaction with one another to produce the original system.
Alternatively the division may be merely a mathematical procedure of
resolving the dynamical system into degrees of freedom which cannot
be separated physically, e.g. the system consisting of a particle with
internal structure may be divided into the degrees of freedom
describing the motion of the centre of the particle and those describing
the internal structure.

22. Schrödinger's representation

Let us consider a dynamical system with $n$ degrees of freedom
having a classical analogue, and thus describable in terms of canonical
coordinates and momenta $q_r$, $p_r$ ($r = 1, 2, \ldots, n$). We assume that the
coordinates $q_r$ are all observables and have continuous ranges of
eigenvalues, these assumptions being reasonable from the physical
significance of the $q$'s. Let us set up a representation with the $q$'s diagonal.
The question arises whether the $q$'s form a complete commuting set
for this dynamical system. It seems pretty obvious from inspection
that they do. We shall here assume that they do, and the assumption
will be justified later (see top of p. 92). With the $q$'s forming a
complete commuting set, the representation is fixed except for the
arbitrary phase factors in it.

Let us consider first the case of n = 1, so that there is only one q
and p, satisfying

$$qp - pq = i\hbar. \tag{10}$$

Any ket may be written in the standard ket notation $\psi(q)\rangle$. From it
we can form another ket $d\psi/dq\rangle$, whose representative is the derivative
of the original one. This new ket is a linear function of the
original one and is thus the result of some linear operator applied to
the original one. Calling this linear operator d/dq, we have

$$\frac{d}{dq}\,\psi\rangle = \frac{d\psi}{dq}\rangle. \tag{11}$$

Equation (11) holding for all functions $\psi$ defines the linear operator
d/dq. We have

$$\frac{d}{dq}\,\rangle = 0. \tag{12}$$

Let us treat the linear operator d/dq according to the general theory
of linear operators of § 7. We should then be able to apply it to a bra
$\langle\phi(q)$, the product $\langle\phi\,d/dq$ being defined, according to (3) of § 7, by

$$\left\{\langle\phi\,\frac{d}{dq}\right\}\psi\rangle = \langle\phi\left\{\frac{d}{dq}\,\psi\rangle\right\} \tag{13}$$

for all functions $\psi(q)$. Taking representatives, we get

$$\int \langle\phi\,\frac{d}{dq}\,|q'\rangle\,dq'\,\psi(q') = \int \phi(q')\,dq'\,\frac{d\psi(q')}{dq'}. \tag{14}$$


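We can transform the right-hand side by partial integration and get

$$\int \langle\phi\,\frac{d}{dq}\,|q'\rangle\,dq'\,\psi(q') = -\int \frac{d\phi(q')}{dq'}\,dq'\,\psi(q'), \tag{15}$$

provided the contributions from the limits of integration vanish.
This gives

$$\langle\phi\,\frac{d}{dq}\,|q'\rangle = -\frac{d\phi(q')}{dq'},$$

showing that

$$\langle\phi\,\frac{d}{dq} = -\langle\frac{d\phi}{dq}. \tag{16}$$

Thus d/dq operating to the left on the conjugate complex of a wave
function has the meaning of minus differentiation with respect to q.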

The  validity  of  this  result  depends  on  our  being  able  to  make  the 
passage  from  (14)  to  (15),  which  requires  that  we  must  restrict  our- 
selves to  bras  and  kets  corresponding  to  wave  functions  that  satisfy 
suitable  boundary  conditions.  The  conditions  usually  holding  in 
practice  are  that  they  vanish  at  the  boundaries.  (Somewhat  more 
general  conditions  will  be  given  in  the  next  section.)  These  conditions 
do  not  limit  the  physical  applicability  of  the  theory,  but,  on  the  con- 
trary, are  usually  required  also  on  physical  grounds.  For  example, 
if q is a Cartesian coordinate of a particle, its eigenvalues run from
$-\infty$ to $\infty$, and the physical requirement that the particle has zero
probability of being at infinity leads to the condition that the wave
function vanishes for $q = \pm\infty$.

The conjugate complex of the linear operator d/dq can be evaluated
by noting that the conjugate imaginary of $d/dq\,\psi\rangle$ or $d\psi/dq\rangle$ is
$\langle d\bar\psi/dq$, or $-\langle\bar\psi\,d/dq$ from (16). Thus the conjugate complex of d/dq
is $-d/dq$, so d/dq is a pure imaginary linear operator.

To get the representative of d/dq we note that, from an application
of formula (63) of § 20,

$$|q''\rangle = \delta(q - q'')\rangle, \tag{17}$$

so that

$$\frac{d}{dq}\,|q''\rangle = \frac{d}{dq}\,\delta(q - q'')\rangle, \tag{18}$$

and hence

$$\langle q'|\,\frac{d}{dq}\,|q''\rangle = \frac{d}{dq'}\,\delta(q' - q''). \tag{19}$$

The representative of d/dq involves the derivative of the $\delta$ function.

Let us work out the commutation relation connecting d/dq with q.
We have

$$\frac{d}{dq}\,q\,\psi\rangle = \frac{d(q\psi)}{dq}\rangle = q\,\frac{d}{dq}\,\psi\rangle + \psi\rangle. \tag{20}$$

Since this holds for any ket $\psi\rangle$, we have

$$\frac{d}{dq}\,q - q\,\frac{d}{dq} = 1. \tag{21}$$

Comparing this result with (10), we see that $-i\hbar\,d/dq$ satisfies the
same commutation relation with q that p does.
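
As an informal check of the commutation relation between d/dq and q, the following Python sketch applies d/dq q - q d/dq to a polynomial wave function and confirms that the polynomial comes back unchanged. The coefficient-list representation and the helper names `ddq`, `times_q` and `commutator_on` are illustrative assumptions of this sketch, not notation from the text.

```python
def ddq(coeffs):
    """Differentiate a polynomial given as [c0, c1, c2, ...], where c_k multiplies q**k."""
    return [k * c for k, c in enumerate(coeffs)][1:] or [0]


def times_q(coeffs):
    """Multiply a polynomial by q, which shifts every coefficient up one power."""
    return [0] + list(coeffs)


def commutator_on(coeffs):
    """Apply (d/dq q - q d/dq) to the polynomial and return the resulting coefficients."""
    left = ddq(times_q(coeffs))    # d/dq acting on (q psi)
    right = times_q(ddq(coeffs))   # q acting on (d psi / dq)
    n = max(len(left), len(right))
    left = left + [0] * (n - len(left))
    right = right + [0] * (n - len(right))
    return [a - b for a, b in zip(left, right)]


psi = [3, -1, 0, 5]                # hypothetical wave function 3 - q + 5 q^3
print(commutator_on(psi) == psi)   # True: the commutator acts as the unit operator
```

Multiplying the same operator by $-i\hbar$ on the left gives $qp - pq = i\hbar$ acting on any polynomial, which is the sense in which $-i\hbar\,d/dq$ can stand in for p.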

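To extend the foregoing work to the case of arbitrary n, we write
the general ket as $\psi(q_1 \ldots q_n)\rangle = \psi\rangle$ and introduce the n linear
operators $\partial/\partial q_r$ ($r = 1, \ldots, n$), which can operate on it in accordance with
the formula

$$\frac{\partial}{\partial q_r}\,\psi\rangle = \frac{\partial\psi}{\partial q_r}\rangle, \tag{22}$$

corresponding to (11). We have

$$\frac{\partial}{\partial q_r}\,\rangle = 0, \tag{23}$$

corresponding to (12). Provided we restrict ourselves to bras and
kets corresponding to wave functions satisfying suitable boundary
conditions, these linear operators can operate also on bras, in
accordance with the formula

$$\langle\phi\,\frac{\partial}{\partial q_r} = -\langle\frac{\partial\phi}{\partial q_r}, \tag{24}$$

corresponding to (16). Thus $\partial/\partial q_r$ can operate to the left on the
conjugate complex of a wave function, when it has the meaning of
minus partial differentiation with respect to $q_r$. We find as before
that each $\partial/\partial q_r$ is a pure imaginary linear operator. Corresponding
to (21) we have the commutation relations

$$\frac{\partial}{\partial q_r}\,q_s - q_s\,\frac{\partial}{\partial q_r} = \delta_{rs}. \tag{25}$$

We have further

$$\frac{\partial}{\partial q_r}\frac{\partial}{\partial q_s}\,\psi\rangle = \frac{\partial^2\psi}{\partial q_r\,\partial q_s}\rangle = \frac{\partial}{\partial q_s}\frac{\partial}{\partial q_r}\,\psi\rangle, \tag{26}$$

showing that

$$\frac{\partial}{\partial q_r}\frac{\partial}{\partial q_s} = \frac{\partial}{\partial q_s}\frac{\partial}{\partial q_r}. \tag{27}$$

Comparing (25) and (27) with (9), we see that the linear operators
$-i\hbar\,\partial/\partial q_r$ satisfy the same commutation relations with the q's and with
each other that the p's do.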


It would be possible to take

$$p_r = -i\hbar\,\partial/\partial q_r \tag{28}$$

without getting any inconsistency. This possibility enables us to see
that the q's must form a complete commuting set of observables,
since it means that any function of the q's and p's could be taken
to be a function of the q's and $-i\hbar\,\partial/\partial q$'s and then could not commute
with all the q's unless it is a function of the q's only.

The equations (28) do not necessarily hold. But in any case the
quantities $p_r + i\hbar\,\partial/\partial q_r$ each commute with all the q's, so each of them
is a function of the q's, from Theorem 2 of § 19. Thus

$$p_r = -i\hbar\,\partial/\partial q_r + f_r(q). \tag{29}$$

Since $p_r$ and $-i\hbar\,\partial/\partial q_r$ are both real, $f_r(q)$ must be real. For any
function f of the q's we have

$$\frac{\partial}{\partial q_r}\,f\,\psi\rangle = f\,\frac{\partial}{\partial q_r}\,\psi\rangle + \frac{\partial f}{\partial q_r}\,\psi\rangle,$$

showing that

$$\frac{\partial}{\partial q_r}\,f - f\,\frac{\partial}{\partial q_r} = \frac{\partial f}{\partial q_r}. \tag{30}$$

With the help of (29) we can now deduce the general formula

$$p_r f - f p_r = -i\hbar\,\partial f/\partial q_r. \tag{31}$$

This formula may be written in P.B. notation

$$[f, p_r] = \partial f/\partial q_r, \tag{32}$$

when it is the same as in the classical theory, as follows from (1).
Multiplying (27) by $(-i\hbar)^2$ and substituting for $-i\hbar\,\partial/\partial q_r$ and $-i\hbar\,\partial/\partial q_s$
their values given by (29), we get

$$(p_r - f_r)(p_s - f_s) = (p_s - f_s)(p_r - f_r),$$

which reduces, with the help of the quantum condition $p_r p_s = p_s p_r$, to

$$p_r f_s + f_r p_s = p_s f_r + f_s p_r.$$

This reduces further, with the help of (31), to

$$\partial f_s/\partial q_r = \partial f_r/\partial q_s, \tag{33}$$

showing that the functions $f_r$ are all of the form

$$f_r = \partial F/\partial q_r \tag{34}$$

with F independent of r. Equation (29) now becomes

$$p_r = -i\hbar\,\partial/\partial q_r + \partial F/\partial q_r. \tag{35}$$

We have been working with a representation which is fixed to the
extent that the q's must be diagonal in it, but which contains arbitrary
phase factors. If the phase factors are changed, the operators $\partial/\partial q_r$
get changed. It will now be shown that, by a suitable change in the
phase factors, the function F in (35) can be made to vanish, so that
equations (28) are made to hold.

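Using stars to distinguish quantities referring to the new representation
with the new phase factors, we shall have the new basic
bras connected with the previous ones by

$$\langle q_1' \ldots q_n'{}^*| = e^{i\gamma'}\langle q_1' \ldots q_n'|, \tag{36}$$

where $\gamma' = \gamma(q')$ is a real function of the $q'$'s. The new representative
of a ket is $e^{i\gamma'}$ times the old one, showing that $e^{i\gamma}\psi\rangle^* = \psi\rangle$, so
we get

$$\rangle^* = e^{-i\gamma}\rangle \tag{37}$$

as the connexion between the new standard ket and the original one.
The new linear operator $(\partial/\partial q_r)^*$ satisfies, corresponding to (22),

$$\left(\frac{\partial}{\partial q_r}\right)^*\psi\rangle^* = \frac{\partial\psi}{\partial q_r}\rangle^* = e^{-i\gamma}\,\frac{\partial\psi}{\partial q_r}\rangle,$$

with the help of (37). Using (22), this gives

$$\left(\frac{\partial}{\partial q_r}\right)^*\psi\rangle^* = e^{-i\gamma}\,\frac{\partial}{\partial q_r}\,\psi\rangle = e^{-i\gamma}\,\frac{\partial}{\partial q_r}\,e^{i\gamma}\,\psi\rangle^*,$$

showing that

$$\left(\frac{\partial}{\partial q_r}\right)^* = e^{-i\gamma}\,\frac{\partial}{\partial q_r}\,e^{i\gamma}, \tag{38}$$

or, with the help of (30),

$$\left(\frac{\partial}{\partial q_r}\right)^* = \frac{\partial}{\partial q_r} + i\,\frac{\partial\gamma}{\partial q_r}. \tag{39}$$

By choosing $\gamma$ so that

$$F = \hbar\gamma + \text{a constant}, \tag{40}$$

(35) becomes

$$p_r = -i\hbar\,(\partial/\partial q_r)^*. \tag{41}$$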

Equation  (40)  fixes  y except  for  an  arbitrary  constant,  so  the  repre- 
sentation is  fixed  except  for  an  arbitrary  constant  phase  factor. 
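
The effect of a change of phase factors on the differentiation operator can be checked numerically for one degree of freedom. The sketch below compares $e^{-i\gamma}\,d/dq\,e^{i\gamma}$ acting on a wave function with $d/dq + i\,d\gamma/dq$ acting on the same function, using central differences. The particular `gamma` and `psi` are hypothetical choices made only for this illustration, and the agreement is limited by the finite-difference step.

```python
import cmath


def gamma(q):
    """Hypothetical real phase function gamma(q), chosen only for this sketch."""
    return 0.3 * q ** 3


def dgamma(q):
    """Exact derivative of the phase function above."""
    return 0.9 * q ** 2


def psi(q):
    """Hypothetical complex wave function."""
    return cmath.exp(-q * q / 2.0) * (1.0 + 0.5j * q)


def derivative(func, q, step=1e-5):
    """Central-difference estimate of d(func)/dq."""
    return (func(q + step) - func(q - step)) / (2.0 * step)


def conjugated_derivative(q):
    """exp(-i gamma) d/dq [exp(i gamma) psi] evaluated at q."""
    shifted = lambda t: cmath.exp(1j * gamma(t)) * psi(t)
    return cmath.exp(-1j * gamma(q)) * derivative(shifted, q)


def derivative_plus_phase_term(q):
    """d psi/dq + i (d gamma/dq) psi evaluated at q."""
    return derivative(psi, q) + 1j * dgamma(q) * psi(q)


for q in (-1.3, 0.0, 0.8):
    print(q, abs(conjugated_derivative(q) - derivative_plus_phase_term(q)))
```

The printed differences should be at the level of the finite-difference error, which is the numerical counterpart of the two forms of the starred operator being equal.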

In this way we see that a representation can be set up in which
the q's are diagonal and equations (28) hold. This representation is
a very useful one for many problems. It will be called Schrödinger's
representation, as it was the representation in terms of which Schrödinger
gave his original formulation of quantum mechanics in 1926.
Schrödinger's representation exists whenever one has canonical q's
and p's, and is completely determined by these q's and p's except for
an arbitrary constant phase factor. It owes its great convenience to
its allowing one to express immediately any algebraic function of the
q's and p's of the form of a power series in the p's as an operator of
differentiation, e.g. if $f(q_1, \ldots, q_n, p_1, \ldots, p_n)$ is such a function, we have

$$f(q_1, \ldots, q_n, p_1, \ldots, p_n) = f(q_1, \ldots, q_n, -i\hbar\,\partial/\partial q_1, \ldots, -i\hbar\,\partial/\partial q_n), \tag{42}$$

provided we preserve the order of the factors in a product on substituting
the $-i\hbar\,\partial/\partial q$'s for the p's.

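From (23) and (28), we have

$$p_r\rangle = 0. \tag{43}$$

Thus the standard ket in Schrödinger's representation is characterized
by the condition that it is a simultaneous eigenket of all the momenta
belonging to the eigenvalues zero. Some properties of the basic
vectors of Schrödinger's representation may also be noted. Equation
(22) gives

$$\langle q_1' \ldots q_n'|\,\frac{\partial}{\partial q_r}\,\psi\rangle = \frac{\partial\psi(q_1' \ldots q_n')}{\partial q_r'} = \frac{\partial}{\partial q_r'}\,\langle q_1' \ldots q_n'|\psi\rangle.$$

Hence

$$\langle q_1' \ldots q_n'|\,\frac{\partial}{\partial q_r} = \frac{\partial}{\partial q_r'}\,\langle q_1' \ldots q_n'|, \tag{44}$$

so that

$$\langle q_1' \ldots q_n'|\,p_r = -i\hbar\,\frac{\partial}{\partial q_r'}\,\langle q_1' \ldots q_n'|. \tag{45}$$

Similarly, equation (24) leads to

$$p_r\,|q_1' \ldots q_n'\rangle = i\hbar\,\frac{\partial}{\partial q_r'}\,|q_1' \ldots q_n'\rangle. \tag{46}$$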

23.  The  momentum  representation 

Let us take a system with one degree of freedom, describable in
terms of a q and p with the eigenvalues of q running from $-\infty$ to $\infty$,
and let us take an eigenket $|p'\rangle$ of p. Its representative in the Schrödinger
representation, $\langle q'|p'\rangle$, satisfies

$$p'\langle q'|p'\rangle = \langle q'|p|p'\rangle = -i\hbar\,\frac{d}{dq'}\,\langle q'|p'\rangle,$$

with the help of (45) applied to the case of one degree of freedom.
The solution of this differential equation for $\langle q'|p'\rangle$ is

$$\langle q'|p'\rangle = c'\,e^{ip'q'/\hbar}, \tag{47}$$

where $c' = c(p')$ is independent of q', but may involve p'.

The representative $\langle q'|p'\rangle$ does not satisfy the boundary conditions
of vanishing at $q' = \pm\infty$. This gives rise to some difficulty, which
shows itself up most directly in the failure of the orthogonality
theorem. If we take a second eigenket $|p''\rangle$ of p with representative

$$\langle q'|p''\rangle = c''\,e^{ip''q'/\hbar},$$

belonging to a different eigenvalue p'', we shall have

$$\langle p'|p''\rangle = \int_{-\infty}^{\infty} \langle p'|q'\rangle\,dq'\,\langle q'|p''\rangle = \bar{c}'c'' \int_{-\infty}^{\infty} e^{-i(p'-p'')q'/\hbar}\,dq'. \tag{48}$$

This integral does not converge according to the usual definition of
convergence. To bring the theory into order, we adopt a new definition
of convergence of an integral whose domain extends to infinity,
analogous to the Cesàro definition of the sum of an infinite series.
With this new definition, an integral whose value to the upper limit
q' is of the form cos aq' or sin aq', with a a real number not zero, is
counted as zero when q' tends to infinity, i.e. we take the mean value
of the oscillations, and similarly for the lower limit of q' tending to
minus infinity. This makes the right-hand side of (48) vanish for
$p'' \neq p'$, so that the orthogonality theorem is restored. Also it makes
the right-hand sides of (13) and (14) equal when $\langle\phi$ and $\psi\rangle$ are
eigenvectors of p, so that eigenvectors of p become permissible vectors to
use with the operator d/dq. Thus the boundary conditions that the
representative of a permissible bra or ket has to satisfy become
extended to allow the representative to oscillate like cos aq' or sin aq'
as q' goes to infinity or minus infinity.

For p'' very close to p', the right-hand side of (48) involves a $\delta$
function. To evaluate it, we need the formula

$$\int_{-\infty}^{\infty} e^{iax}\,dx = 2\pi\,\delta(a) \tag{49}$$

for real a, which may be proved as follows. The formula evidently
holds for a different from zero, as both sides are then zero. Further
we have, for any continuous function f(a),

$$\int_{-\infty}^{\infty} f(a)\,da \int_{-g}^{g} e^{iax}\,dx = \int_{-\infty}^{\infty} f(a)\,da\;2a^{-1}\sin ag = 2\pi f(0)$$

in the limit when g tends to infinity. A more complicated argument
shows that we get the same result if instead of the limits g and $-g$
we put $g_1$ and $-g_2$, and then let $g_1$ and $g_2$ tend to infinity in different
ways (not too widely different). This shows the equivalence of both
sides of (49) as factors in an integrand, which proves the formula.
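
The limiting statement used in this proof can be illustrated numerically. The following Python sketch evaluates the integral of $f(a)\,2a^{-1}\sin ag$ by a midpoint rule for increasing g and compares it with $2\pi f(0)$. The Gaussian test function, the finite integration range and the step count are assumptions of the sketch, chosen so that the truncation is harmless for this particular f.

```python
import math


def windowed_integral(f, g, a_max=8.0, steps=16000):
    """Midpoint estimate of the integral of f(a) * 2 sin(a g) / a over [-a_max, a_max]."""
    h = 2.0 * a_max / steps
    total = 0.0
    for k in range(steps):
        a = -a_max + (k + 0.5) * h  # midpoints never land on the removable point a = 0
        total += f(a) * 2.0 * math.sin(a * g) / a
    return total * h


def gaussian(a):
    """A smooth test function with gaussian(0) = 1, assumed only for this illustration."""
    return math.exp(-a * a)


for g in (1.0, 3.0, 10.0):
    print(g, windowed_integral(gaussian, g), 2.0 * math.pi * gaussian(0.0))
```

For this test function the integral is known in closed form as $2\pi\,\mathrm{erf}(g/2)$, so the approach to $2\pi f(0)$ as g grows can be followed exactly.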
With the help of (49), (48) becomes

$$\langle p'|p''\rangle = \bar{c}'c''\,2\pi\,\delta[(p'-p'')/\hbar] = \bar{c}'c''\,h\,\delta(p'-p'') = |c'|^2\,h\,\delta(p'-p''). \tag{50}$$

We have obtained an eigenket of p belonging to any real eigenvalue
p', its representative being given by (47). Any ket $|X\rangle$ can be expanded
in terms of these eigenkets of p, since its representative
$\langle q'|X\rangle$ can be expanded in terms of the representatives (47) by
Fourier analysis. It follows that the momentum p is an observable,
in agreement with the experimental result that momenta can be
observed.

A symmetry now appears between q and p. Each of them is an
observable with eigenvalues extending from $-\infty$ to $\infty$, and the
commutation relation connecting q and p, equation (10), remains
invariant if we interchange q and p and write $-i$ for i. We have set
up a representation in which q is diagonal and $p = -i\hbar\,d/dq$. It
follows from the symmetry that we can also set up a representation
in which p is diagonal and

$$q = i\hbar\,d/dp, \tag{51}$$

the operator d/dp being defined by a procedure similar to that used
for d/dq. This representation will be called the momentum representation.
It is less useful than the previous Schrödinger representation
because, while the Schrödinger representation enables one to express
as an operator of differentiation any function of q and p that is a
power series in p, the momentum representation enables one so to
express any function of q and p that is a power series in q, and the
important quantities in dynamics are almost always power series in
p but are often not power series in q. All the same the momentum
representation is of value for certain problems (see § 50).

Let us calculate the transformation function $\langle q'|p'\rangle$ connecting the
two representations. The basic kets $|p'\rangle$ of the momentum representation
are eigenkets of p and their Schrödinger representatives $\langle q'|p'\rangle$
are given by (47) with the coefficients c' suitably chosen. The phase
factors of these basic kets must be chosen so as to make (51) hold.
The easiest way to bring in this condition is to use the symmetry
between q and p referred to above, according to which $\langle q'|p'\rangle$ must
go over into $\langle p'|q'\rangle$ if we interchange q and p and write $-i$ for i.
Now $\langle q'|p'\rangle$ is equal to the right-hand side of (47) and $\langle p'|q'\rangle$ to the
conjugate complex expression, and hence c' must be independent of
p'. Thus c' is just a number c. Further, we must have

$$\langle p'|p''\rangle = \delta(p'-p''),$$

which shows, on comparison with (50), that $|c| = h^{-1/2}$. We can choose
the arbitrary constant phase factor in either representation so as to
make $c = h^{-1/2}$, and we then get

$$\langle q'|p'\rangle = h^{-1/2}\,e^{ip'q'/\hbar} \tag{52}$$

for the transformation function.

The foregoing work may easily be generalized to a system with
n degrees of freedom, describable in terms of n q's and p's, with the
eigenvalues of each q running from $-\infty$ to $\infty$. Each p will then be
an observable with eigenvalues running from $-\infty$ to $\infty$, and the set
of p's will form a complete commuting set of observables. We get a
momentum representation in which the p's are diagonal and each

$$q_r = i\hbar\,\partial/\partial p_r. \tag{53}$$

The transformation function will be given by the product of the
transformation functions for each degree of freedom separately, as is shown by
formula (67) of § 20, and will thus be

$$\langle q_1' q_2' \ldots q_n'|p_1' p_2' \ldots p_n'\rangle = \langle q'|p'\rangle = h^{-n/2}\,e^{i(p_1'q_1' + p_2'q_2' + \ldots + p_n'q_n')/\hbar}. \tag{54}$$

24. Heisenberg's principle of uncertainty

For a system with one degree of freedom, the Schrödinger and the
momentum representatives of a ket $|X\rangle$ are connected by

$$\langle p'|X\rangle = h^{-1/2}\int_{-\infty}^{\infty} e^{-iq'p'/\hbar}\,dq'\,\langle q'|X\rangle, \qquad \langle q'|X\rangle = h^{-1/2}\int_{-\infty}^{\infty} e^{iq'p'/\hbar}\,dp'\,\langle p'|X\rangle. \tag{55}$$

These formulas have an elementary significance. They show that
either of the representatives is given, apart from numerical coefficients,
by the amplitudes of the Fourier components of the other.

It is of interest to apply (55) to a ket whose Schrödinger representative
is a so-called wave packet. This is a function
whose value is very small everywhere outside a certain domain, of
width $\Delta q'$ say, and inside this domain is approximately periodic with
a definite frequency.† If a Fourier analysis is made of such a wave
packet, the amplitude of all the Fourier components will be small,
except those in the neighbourhood of the definite frequency. The
components whose amplitudes are not small will fill up a frequency†
band whose width is of the order $1/\Delta q'$, since two components whose
frequencies differ by this amount, if in phase in the middle of the
domain $\Delta q'$, will be just out of phase and interfering at the ends of
this domain. Now in the first of equations (55) the variable
$(2\pi)^{-1}p'/\hbar = p'/h$ plays the part of frequency. Thus with $\langle q'|X\rangle$ of the
form of a wave packet, the function $\langle p'|X\rangle$, being composed of the
amplitudes of the Fourier components of the wave packet, will be
small everywhere in the p-space outside a certain domain of width
$\Delta p' = h/\Delta q'$.

Let us now apply the physical interpretation of the square of the
modulus of the representative of a ket as a probability. We find that
our wave packet represents a state for which a measurement of q is
almost certain to lead to a result lying in a domain of width $\Delta q'$ and
a measurement of p is almost certain to lead to a result lying in a
domain of width $\Delta p'$. We may say that for this state q has a definite
value with an error of order $\Delta q'$ and p has a definite value with an
error of order $\Delta p'$. The product of these two errors is

$$\Delta q'\,\Delta p' = h. \tag{56}$$

Thus the more accurately one of the variables q, p has a definite
value, the less accurately the other has a definite value. For a system
with several degrees of freedom, equation (56) applies to each degree
of freedom separately.
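
The reciprocal relation between the two widths can be seen numerically. The Python sketch below builds a Gaussian wave packet, obtains its momentum representative from the Fourier integral connecting the two representatives, and measures both spreads. Several choices here are assumptions of the sketch and not part of the text: units with $\hbar = 1$, a Gaussian envelope, a plain Riemann sum for the integral, and the root-mean-square spread as the definition of "width". With that definition a Gaussian gives a product of $\hbar/2$, which agrees with the order-of-magnitude statement above; the point to notice is that the product does not change when the packet is made narrower or wider.

```python
import cmath
import math

HBAR = 1.0  # assumed units in which hbar = 1, so that h = 2*pi


def rms_width(points, weights):
    """Root-mean-square spread of `points` under the (unnormalized) `weights`."""
    total = sum(weights)
    mean = sum(x * w for x, w in zip(points, weights)) / total
    variance = sum((x - mean) ** 2 * w for x, w in zip(points, weights)) / total
    return math.sqrt(variance)


def packet_widths(sigma, p0=5.0, n=400):
    """Return (delta_q, delta_p) for a Gaussian wave packet of spatial scale sigma."""
    # Schrodinger representative on a midpoint grid covering +-10 sigma.
    q_max = 10.0 * sigma
    dq = 2.0 * q_max / n
    qs = [-q_max + (i + 0.5) * dq for i in range(n)]
    psi = [cmath.exp(-q * q / (4.0 * sigma ** 2) + 1j * p0 * q / HBAR) for q in qs]

    # Momentum representative from the Fourier integral, on a grid centred on p0.
    p_half = 5.0 * HBAR / sigma
    dp = 2.0 * p_half / n
    ps = [p0 - p_half + (j + 0.5) * dp for j in range(n)]
    norm = 1.0 / math.sqrt(2.0 * math.pi * HBAR)  # h**(-1/2)
    phi = [norm * dq * sum(cmath.exp(-1j * q * p / HBAR) * a for q, a in zip(qs, psi))
           for p in ps]

    delta_q = rms_width(qs, [abs(a) ** 2 for a in psi])
    delta_p = rms_width(ps, [abs(a) ** 2 for a in phi])
    return delta_q, delta_p


for sigma in (0.5, 2.0):
    delta_q, delta_p = packet_widths(sigma)
    print(sigma, delta_q, delta_p, delta_q * delta_p / HBAR)
```

Quadrupling `sigma` quadruples the spread in q and divides the spread in p by four, leaving the product fixed.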

Equation  (56)  is  known  as  Heisenberg's  Principle  of  Uncertainty. 
It  shows  clearly  the  limitations  in  the  possibility  of  simultaneously 
assigning  numerical  values,  for  any  particular  state,  to  two  non- 
commuting observables,  when  those  observables  are  a canonical  co- 
ordinate and  momentum,  and  provides  a plain  illustration  of  how 
observations  in  quantum  mechanics  may  be  incompatible.  It  also 
shows  how  classical  mechanics,  which  assumes  that  numerical  values 
can  be  assigned  simultaneously  to  all  observables,  may  be  a valid 
approximation when h can be considered as small enough to be
negligible. Equation (56) holds only in the most favourable case,
which occurs when the representative of the state is of the form of a
wave packet. Other forms of representative would lead to a $\Delta q'$ and
$\Delta p'$ whose product is larger than h.

† Frequency here means reciprocal of wave-length.

Heisenberg's principle of uncertainty shows that, in the limit when
either q or p is completely determined, the other is completely
undetermined. This result can also be obtained directly from the
transformation function $\langle q'|p'\rangle$. According to the end of § 18,
$|\langle q'|p'\rangle|^2\,dq'$ is proportional to the probability of q having a value in
the small range from q' to q' + dq' for the state for which p certainly
has the value p', and from (52) this probability is independent of q'
for a given dq'. Thus if p certainly has a definite value p', all values
of q are equally probable. Similarly, if q certainly has a definite value
q', all values of p are equally probable.

It  is  evident  physically  that  a state  for  which  all  values  of  q are 
equally  probable,  or  one  for  which  all  values  of  p are  equally  probable, 
cannot  be  attained  in  practice,  in  the  first  case  because  of  limitations 
of  size  and  in  the  second  because  of  limitations  of  energy.  Thus  an 
eigenstate  of  p or  an  eigenstate  of  q cannot  be  attained  in  practice. 
The  argument  at  the  end  of  § 12  already  showed  that  such  eigenstates 
are  unattainable,  because  of  the  infinite  precision  that  would  be 
needed  to  set  them  up,  and  we  now  have  another  argument  leading 
to  the  same  conclusion. 

25.  Displacement  operators 

We  get  a new  insight  into  the  meaning  of  some  of  the  quantum  con- 
ditions by  making  a study  of  displacement  operators.  These  appear 
in  the  theory  when  we  take  into  consideration  that  the  scheme  of 
relations  between  states  and  dynamical  variables  given  in  Chapter  II 
is  essentially  a physical  scheme,  so  that  if  certain  states  and  dynamical 
variables  are  connected  by  some  relation,  on  our  displacing  them  all 
in  a definite  way  (for  example,  displacing  them  all  through  a distance 
$\delta x$ in the direction of the x-axis of Cartesian coordinates), the new
states  and  dynamical  variables  would  have  to  be  connected  by  the 
same  relation. 

The displacement of a state or observable is a perfectly definite
process physically. Thus to displace a state or observable through a
distance $\delta x$ in the direction of the x-axis, we should merely have to
displace all the apparatus used in preparing the state, or all the
apparatus required to measure the observable, through the distance
$\delta x$ in the direction of the x-axis, and the displaced apparatus would
define the displaced state or observable. The displacement of a
dynamical variable must be just as definite as the displacement of
an observable, because of the close mathematical connexion between
dynamical variables and observables. A displaced state or dynamical
variable is uniquely determined by the undisplaced state or dynamical
variable together with the direction and magnitude of the displacement.

The  displacement  of  a ket  vector  is  not  such  a definite  thing  though. 
If  we  take  a certain  ket  vector,  it  will  represent  a certain  state  and  we 
may  displace  this  state  and  get  a perfectly  definite  new  state,  but  this 
new  state  will  not  determine  our  displaced  ket,  but  only  the  direction 
of  our  displaced  ket.  We  help  to  fix  our  displaced  ket  by  requiring 
that  it  shall  have  the  same  length  as  the  undisplaced  ket,  but  even 
then  it  is  not  completely  determined,  but  can  still  be  multiplied  by 
an  arbitrary  phase  factor.  One  would  think  at  first  sight  that  each 
ket  one  displaces  would  have  a different  arbitrary  phase  factor, 
but  with  the  help  of  the  following  argument,  we  see  that  it  must  be 
the  same  for  them  all.  We  make  use  of  the  law  that  superposition 
relationships  between  states  remain  invariant  under  the  displace- 
ment. A superposition  relationship  between  states  is  expressed 
mathematically  by  a linear  equation  between  the  kets  corresponding 
to  those  states,  for  example 

$$|R\rangle = c_1|A\rangle + c_2|B\rangle, \tag{57}$$

where $c_1$ and $c_2$ are numbers, and the invariance of the superposition
relationship requires that the displaced states correspond to kets
with the same linear equation between them; in our example they
would correspond to $|Rd\rangle$, $|Ad\rangle$, $|Bd\rangle$ say, satisfying

$$|Rd\rangle = c_1|Ad\rangle + c_2|Bd\rangle. \tag{58}$$

We take these kets to be our displaced kets, rather than these kets
multiplied by arbitrary independent phase factors, which latter
kets would satisfy a linear equation with different coefficients $c_1$, $c_2$.
The  only  arbitrariness  now  left  in  the  displaced  kets  is  that  of  a single 
arbitrary  phase  factor  to  be  multiplied  into  all  of  them. 

The condition that linear equations between the kets remain invariant
under the displacement and that an equation such as (58)
holds whenever the corresponding (57) holds, means that the displaced
kets are linear functions of the undisplaced kets and thus each
displaced ket $|Pd\rangle$ is the result of some linear operator applied to the
corresponding undisplaced ket $|P\rangle$. In symbols,

$$|Pd\rangle = D|P\rangle, \tag{59}$$

where D is a linear operator independent of $|P\rangle$ and depending only
on the displacement. The arbitrary phase factor by which all the
displaced kets may be multiplied results in D being undetermined
to the extent of an arbitrary numerical factor of modulus unity.

With the displacement of kets made definite in the above manner
and the displacement of bras, of course, made equally definite,
through their being the conjugate imaginaries of the kets, we can
now assert that any symbolic equation between kets, bras, and
dynamical variables must remain invariant under the displacement
of every symbol occurring in it, on account of such an equation
having some physical significance which will not get changed by the
displacement.

Take as an example the equation

$$\langle Q|P\rangle = c,$$

c being a number. Then we must have

$$\langle Qd|Pd\rangle = c = \langle Q|P\rangle. \tag{60}$$

From the conjugate imaginary of (59) with Q instead of P,

$$\langle Qd| = \langle Q|\bar{D}. \tag{61}$$

Hence (60) gives

$$\langle Q|\bar{D}D|P\rangle = \langle Q|P\rangle.$$

Since this holds for arbitrary $\langle Q|$ and $|P\rangle$, we must have

$$\bar{D}D = 1, \tag{62}$$

giving us a general condition which D has to satisfy.

Take as a second example the equation

$$v|P\rangle = |R\rangle,$$

where v is any dynamical variable. Then, using $v_d$ to denote the
displaced dynamical variable, we must have

$$v_d|Pd\rangle = |Rd\rangle.$$

With the help of (59) we get

$$v_d|Pd\rangle = D|R\rangle = Dv|P\rangle = DvD^{-1}|Pd\rangle.$$

Since $|Pd\rangle$ can be any ket, we must have

$$v_d = DvD^{-1}, \tag{63}$$

which shows that the linear operator D determines the displacement
of dynamical variables as well as that of kets and bras. Note that
the arbitrary numerical factor of modulus unity in D does not affect
$v_d$, and also it does not affect the validity of (62).

Let us now pass to an infinitesimal displacement, i.e. taking the
displacement through the distance $\delta x$ in the direction of the x-axis,
let us make $\delta x \to 0$. From physical continuity we should expect
a displaced ket $|Pd\rangle$ to tend to the original $|P\rangle$ and we may further
expect the limit

$$\lim_{\delta x\to 0}\frac{|Pd\rangle - |P\rangle}{\delta x} = \lim_{\delta x\to 0}\frac{D-1}{\delta x}\,|P\rangle$$

to exist. This requires that the limit

$$\lim_{\delta x\to 0}(D-1)/\delta x \tag{64}$$

shall exist. This limit is a linear operator which we shall call the
displacement operator for the x-direction and denote by $d_x$. The
arbitrary numerical factor $e^{i\gamma}$ with $\gamma$ real which we may multiply
into D must be made to tend to unity as $\delta x \to 0$ and then introduces
an arbitrariness in $d_x$, namely, $d_x$ may be replaced by

$$\lim_{\delta x\to 0}(De^{i\gamma}-1)/\delta x = \lim_{\delta x\to 0}(D-1+i\gamma)/\delta x = d_x + ia_x,$$

where $a_x$ is the limit of $\gamma/\delta x$. Thus $d_x$ contains an arbitrary additive
pure imaginary number.

For $\delta x$ small

$$D = 1 + \delta x\,d_x. \tag{65}$$

Substituting this into (62), we get

$$(1+\delta x\,\bar{d}_x)(1+\delta x\,d_x) = 1,$$

which reduces, with neglect of $\delta x^2$, to

$$\delta x\,(\bar{d}_x + d_x) = 0.$$

Thus $d_x$ is a pure imaginary linear operator. Substituting (65) into
(63) we get, with neglect of $\delta x^2$ again,

$$v_d = (1+\delta x\,d_x)\,v\,(1-\delta x\,d_x) = v + \delta x\,(d_x v - v\,d_x), \tag{66}$$

showing that

$$\lim_{\delta x\to 0}(v_d - v)/\delta x = d_x v - v\,d_x. \tag{67}$$

We may describe any dynamical system in terms of the following
dynamical variables: the Cartesian coordinates x, y, z of the centre of
mass of the system, the components $p_x, p_y, p_z$ of the total momentum
of the system, which are the canonical momenta conjugate to x, y, z
respectively, and any dynamical variables needed for describing
internal degrees of freedom of the system. If we suppose a piece
of apparatus which has been set up to measure x, to be displaced a
distance $\delta x$ in the direction of the x-axis, it will measure $x - \delta x$, hence

$$x_d = x - \delta x.$$

Comparing this with (66) for v = x, we obtain

$$d_x x - x\,d_x = -1. \tag{68}$$

This is the quantum condition connecting $d_x$ with x. From similar
arguments we find that $y, z, p_x, p_y, p_z$ and the internal dynamical
variables, which are unaffected by the displacement, must commute with
$d_x$. Comparing these results with (9), we see that $i\hbar\,d_x$ satisfies just
the same quantum conditions as $p_x$. Their difference, $p_x - i\hbar\,d_x$,
commutes with all the dynamical variables and must therefore be a
number. This number, which is necessarily real since $p_x$ and $i\hbar\,d_x$ are
both real, may be made zero by a suitable choice of the arbitrary,
pure imaginary number that can be added to $d_x$. We then have the
result

$$p_x = i\hbar\,d_x, \tag{69}$$

or the x-component of the total momentum of the system is $i\hbar$ times the
displacement operator $d_x$.

This is a fundamental result, which gives a new significance to
displacement operators. There is a corresponding result, of course,
also for the y and z displacement operators $d_y$ and $d_z$. The quantum
conditions which state that $p_x$, $p_y$ and $p_z$ commute with each other
are now seen to be connected with the fact that displacements in
different directions are commutable operations.
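
A concrete picture of the displacement operator may help here. The Python sketch below assumes, as is conventional but not stated in the text, that displacing a state through $\delta x$ replaces its wave function $\psi(x)$ by $\psi(x - \delta x)$. Under that assumption the quotient $(D - 1)/\delta x$ applied to $\psi$ tends to $-d\psi/dx$, so that $d_x$ acts as $-d/dx$, it satisfies $d_x x - x\,d_x = -1$, and $i\hbar\,d_x$ becomes $-i\hbar\,d/dx$. The polynomial wave function and all helper names are hypothetical.

```python
def poly_eval(coeffs, x):
    """Evaluate the polynomial [c0, c1, ...] (c_k multiplies x**k) at x."""
    return sum(c * x ** k for k, c in enumerate(coeffs))


def poly_deriv(coeffs):
    """Coefficients of the derivative polynomial."""
    return [k * c for k, c in enumerate(coeffs)][1:] or [0.0]


def displacement_quotient(coeffs, x, dx):
    """((D - 1)/dx) psi at x, taking the displaced wave function to be psi(x - dx)."""
    return (poly_eval(coeffs, x - dx) - poly_eval(coeffs, x)) / dx


def d_x(coeffs):
    """Limiting displacement operator under the assumption above: minus the derivative."""
    return [-c for c in poly_deriv(coeffs)]


def times_x(coeffs):
    """Multiply the polynomial by x."""
    return [0.0] + list(coeffs)


psi = [1.0, 2.0, 0.0, -1.0]   # hypothetical wave function 1 + 2x - x^3
x0 = 0.7

# Finite displacement quotient against the limiting operator at one point.
print(displacement_quotient(psi, x0, 1e-6), poly_eval(d_x(psi), x0))

# (d_x x - x d_x) psi should equal -psi, coefficient by coefficient.
commutator = [a - b for a, b in zip(d_x(times_x(psi)), times_x(d_x(psi)))]
print(commutator == [-c for c in psi])   # True
```

The first printed pair agrees to within the size of the finite step, and the second line confirms the quantum condition connecting $d_x$ with x for this wave function.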

26. Unitary transformations

Let U be any linear operator that has a reciprocal $U^{-1}$ and consider
the equation

$$\alpha^* = U\alpha U^{-1}, \tag{70}$$

$\alpha$ being an arbitrary linear operator. This equation may be regarded
as expressing a transformation from any linear operator $\alpha$ to a
corresponding linear operator $\alpha^*$, and as such it has rather remarkable
properties. In the first place it should be noted that each $\alpha^*$ has the
same eigenvalues as the corresponding $\alpha$; since, if $\alpha'$ is any eigenvalue
of $\alpha$ and $|\alpha'\rangle$ is an eigenket belonging to it, we have

$$\alpha|\alpha'\rangle = \alpha'|\alpha'\rangle$$

and hence

$$U\alpha U^{-1}\,U|\alpha'\rangle = U\alpha|\alpha'\rangle = \alpha'\,U|\alpha'\rangle,$$

showing that $U|\alpha'\rangle$ is an eigenket of $\alpha^*$ belonging to the same eigenvalue
$\alpha'$, and similarly any eigenvalue of $\alpha^*$ may be shown to be also
an eigenvalue of $\alpha$. Further, if we take several $\alpha$'s that are connected
by algebraic equations and transform them all according to (70), the
corresponding $\alpha^*$'s will be connected by the same algebraic equations.
This result follows from the fact that the fundamental algebraic processes
of addition and multiplication are left invariant by the transformation
(70), as is shown by the following equations:

$$(\alpha_1+\alpha_2)^* = U(\alpha_1+\alpha_2)U^{-1} = U\alpha_1U^{-1} + U\alpha_2U^{-1} = \alpha_1^* + \alpha_2^*,$$
$$(\alpha_1\alpha_2)^* = U\alpha_1\alpha_2U^{-1} = U\alpha_1U^{-1}\,U\alpha_2U^{-1} = \alpha_1^*\alpha_2^*.$$

Let us now see what condition would be imposed on U by the
requirement that any real $\alpha$ transforms into a real $\alpha^*$. Equation
(70) may be written

$$\alpha^* U = U\alpha. \tag{71}$$

Taking the conjugate complex of both sides in accordance with
(5) of § 8 we find, if $\alpha$ and $\alpha^*$ are both real,

$$\bar{U}\alpha^* = \alpha\bar{U}. \tag{72}$$

Equation (71) gives us

$$\bar{U}\alpha^* U = \bar{U}U\alpha$$

and equation (72) gives us

$$\bar{U}\alpha^* U = \alpha\bar{U}U.$$

Hence

$$\bar{U}U\alpha = \alpha\bar{U}U.$$

Thus $\bar{U}U$ commutes with any real linear operator and therefore also
with any linear operator whatever, since any linear operator can be
expressed as one real one plus i times another. Hence $\bar{U}U$ is a
number. It is obviously real, its conjugate complex according to (5)
of § 8 being the same as itself, and further it must be a positive
number, since for any ket $|P\rangle$, $\langle P|\bar{U}U|P\rangle$ is positive as well as
$\langle P|P\rangle$. We can suppose it to be unity without any loss of generality
in the transformation (70). We then have

$$\bar{U}U = 1. \tag{73}$$

Equation (73) is equivalent to any of the following

$$U = \bar{U}^{-1}, \qquad \bar{U} = U^{-1}, \qquad U^{-1}\bar{U}^{-1} = 1. \tag{74}$$

A matrix or linear operator $U$ that satisfies (73) and (74) is said
to be unitary and a transformation (70) with unitary $U$ is called a
unitary transformation. A unitary transformation transforms real
linear operators into real linear operators and leaves invariant any
algebraic equation between linear operators. It may be considered
as applying also to kets and bras, in accordance with the equations

$$|P^*\rangle = U|P\rangle, \qquad \langle P^*| = \langle P|\bar U = \langle P|U^{-1}, \tag{75}$$

and then it leaves invariant any algebraic equation between linear
operators, kets, and bras. It transforms eigenvectors of $\alpha$ into
eigenvectors of $\alpha^*$. From this one can easily deduce that it transforms an
observable into an observable and that it leaves invariant any functional
relation between observables based on the general definition
of a function given in § 11.

The inverse of a unitary transformation is also a unitary transformation,
since from (74), if $U$ is unitary, $U^{-1}$ is also unitary.
Further, if two unitary transformations are applied in succession,
the result is a third unitary transformation, as may be verified in
the following way. Let the two unitary transformations be (70) and

$$\alpha^\dagger = V\alpha^* V^{-1}.$$

The connexion between $\alpha^\dagger$ and $\alpha$ is then

$$\alpha^\dagger = VU\alpha U^{-1}V^{-1} = (VU)\alpha(VU)^{-1} \tag{76}$$

from (42) of § 11. Now $VU$ is unitary since

$$\overline{VU}\,VU = \bar U\bar VVU = \bar UU = 1,$$

and hence (76) is a unitary transformation.
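
The properties just derived can be checked numerically on small matrices. The following is only an illustrative sketch: the 2 x 2 matrices `U`, `V`, `alpha1`, `alpha2` and the helper names are assumptions chosen for the example, and the conjugate transpose of a matrix is used for the conjugate complex of a linear operator.

```python
import cmath
import math

def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    """Conjugate transpose, standing in for the conjugate complex of the text."""
    return [[a[j][i].conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def close(a, b, tol=1e-12):
    """True when two matrices agree element by element within tol."""
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def transform(w, alpha):
    """alpha* = W alpha W^{-1}, using W^{-1} = dagger(W) for a unitary W."""
    return matmul(matmul(w, alpha), dagger(w))

IDENTITY = [[1, 0], [0, 1]]
theta = 0.7
U = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta), math.cos(theta)]]          # a rotation, hence unitary
V = [[cmath.exp(0.3j), 0], [0, cmath.exp(-1.1j)]]  # diagonal phases, unitary
VU = matmul(V, U)

alpha1 = [[1, 2 - 1j], [2 + 1j, 3]]   # a real (Hermitian) operator
alpha2 = [[0, 1j], [-1j, 2]]          # another real operator

u_is_unitary = close(matmul(dagger(U), U), IDENTITY)            # equation (73)
vu_is_unitary = close(matmul(dagger(VU), VU), IDENTITY)         # product is unitary
stays_real = close(transform(U, alpha1), dagger(transform(U, alpha1)))
product_kept = close(transform(U, matmul(alpha1, alpha2)),
                     matmul(transform(U, alpha1), transform(U, alpha2)))
composition = close(transform(V, transform(U, alpha1)), transform(VU, alpha1))  # (76)
print(u_is_unitary, vu_is_unitary, stays_real, product_kept, composition)
```

Each flag corresponds to one statement of the text: unitarity of $U$, unitarity of the product $VU$, reality of the transformed operator, invariance of multiplication, and the composition rule (76).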

The  transformation  given  in  the  preceding  section  from  undisplaced 
to  displaced  quantities  is  an  example  of  a unitary  transformation,  as 
is  shown  by  equations  (62),  (63),  corresponding  to  equations  (73), 
(70),  and  equations  (59),  (61),  corresponding  to  equations  (75). 

In classical mechanics one can make a transformation from the
canonical coordinates and momenta $q_r, p_r$ $(r = 1,\ldots,n)$ to a new set of
variables $q_r^*, p_r^*$ $(r = 1,\ldots,n)$ satisfying the same P.B. relations as the
$q$'s and $p$'s, i.e. equations (8) of § 21 with $q^*$'s and $p^*$'s replacing the
$q$'s and $p$'s, and can express all dynamical variables in terms of the $q^*$'s
and $p^*$'s. The $q^*$'s and $p^*$'s are then also called canonical coordinates
and momenta and the transformation is called a contact transformation.
One can easily verify that the P.B. of any two dynamical
variables $u$ and $v$ is correctly given by formula (1) of § 21 with $q^*$'s and
$p^*$'s instead of $q$'s and $p$'s, so that the P.B. relationship is invariant
under a contact transformation. This results in the new canonical
coordinates and momenta being on the same footing as the original
ones for many purposes of general dynamical theory, even though the
new coordinates $q_r^*$ may not be a set of Lagrangian coordinates but
may be functions of the Lagrangian coordinates and velocities.

It will now be shown that, for a quantum dynamical system that
has a classical analogue, unitary transformations in the quantum theory
are the analogue of contact transformations in the classical theory.
Unitary transformations are more general than contact transformations,
since the former can be applied to systems in quantum
mechanics that have no classical analogue, but for those systems in
quantum mechanics which are describable in terms of canonical
coordinates and momenta, the analogy between the two kinds of
transformation holds. To establish it, we note that a unitary
transformation applied to the quantum variables $q_r, p_r$ gives new variables
$q_r^*, p_r^*$ satisfying the same P.B. relations, since the P.B. relations are
equivalent to the algebraic relations (9) of § 21 and algebraic relations
are left invariant by a unitary transformation. Conversely, any real
variables $q_r^*, p_r^*$ satisfying the P.B. relations for canonical coordinates
and momenta are connected with the $q_r, p_r$ by a unitary transformation,
as is shown by the following argument.

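We use the Schrödinger representation, and write the basic ket
$|q_1'\ldots q_n'\rangle$ as $|q'\rangle$ for brevity. Since we are assuming that the
$q_r^*, p_r^*$ satisfy the P.B. relations for canonical coordinates and momenta,
we can set up a Schrödinger representation referring to them, with
the $q_r^*$ diagonal and each $p_r^*$ equal to $-i\hbar\,\partial/\partial q_r^*$. The basic
kets in this second Schrödinger representation will be
$|q_1^{*\prime}\ldots q_n^{*\prime}\rangle$, which we write $|q^{*\prime}\rangle$ for brevity.
Now introduce the linear operator $U$ defined by

$$\langle q^{*\prime}|U|q'\rangle = \delta(q^{*\prime}-q'), \tag{77}$$

where $\delta(q^{*\prime}-q')$ is short for

$$\delta(q^{*\prime}-q') = \delta(q_1^{*\prime}-q_1')\,\delta(q_2^{*\prime}-q_2')\ldots\delta(q_n^{*\prime}-q_n'). \tag{78}$$

The conjugate complex of (77) is

$$\langle q'|\bar U|q^{*\prime}\rangle = \delta(q^{*\prime}-q'),$$

and hence†

$$\langle q'|\bar UU|q''\rangle = \int \langle q'|\bar U|q^{*\prime}\rangle\,dq^{*\prime}\,\langle q^{*\prime}|U|q''\rangle
 = \int \delta(q^{*\prime}-q')\,dq^{*\prime}\,\delta(q^{*\prime}-q'') = \delta(q'-q''),$$

so that $\bar UU = 1$.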


† We use the notation of a single integral sign and $dq^{*\prime}$ to denote an
integral over all the variables $q_1^{*\prime}, q_2^{*\prime}, \ldots, q_n^{*\prime}$. This
abbreviation will be used also in future work.

Thus $U$ is a unitary operator. We have further

$$\langle q^{*\prime}|q_r^* U|q'\rangle = q_r^{*\prime}\,\delta(q^{*\prime}-q')$$

and

$$\langle q^{*\prime}|Uq_r|q'\rangle = \delta(q^{*\prime}-q')\,q_r'.$$

The right-hand sides of these two equations are equal on account of
the property of the $\delta$ function (11) of § 15, and hence

$$q_r^* U = Uq_r$$

or

$$q_r^* = Uq_rU^{-1}.$$

Again, from (45) and (46),

$$\langle q^{*\prime}|p_r^* U|q'\rangle = -i\hbar\frac{\partial}{\partial q_r^{*\prime}}\delta(q^{*\prime}-q'),$$

$$\langle q^{*\prime}|Up_r|q'\rangle = i\hbar\frac{\partial}{\partial q_r'}\delta(q^{*\prime}-q').$$

The right-hand sides of these two equations are obviously equal, and
hence

$$p_r^* U = Up_r$$

or

$$p_r^* = Up_rU^{-1}.$$

Thus all the conditions for a unitary transformation are verified.

We get an infinitesimal unitary transformation by taking $U$ in (70)
to differ by an infinitesimal from unity. Put

$$U = 1 + i\epsilon F,$$

where $\epsilon$ is infinitesimal, so that its square can be neglected. Then

$$U^{-1} = 1 - i\epsilon F.$$

The unitary condition (73) or (74) requires that $F$ shall be real. The
transformation equation (70) now takes the form

$$\alpha^* = (1+i\epsilon F)\,\alpha\,(1-i\epsilon F),$$

which gives

$$\alpha^* - \alpha = i\epsilon(F\alpha - \alpha F). \tag{79}$$

It may be written in P.B. notation

$$\alpha^* - \alpha = \epsilon\hbar[\alpha, F]. \tag{80}$$
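
As an illustrative check of (79) and (80), one can expand the product to the first order in $\epsilon$ and then apply the result to a single coordinate $q$ with conjugate momentum $p$. The choice $F = p/\hbar$ is an assumption made for this example only; the commutation relation $qp - pq = i\hbar$ is used.

```latex
\begin{align*}
\alpha^* &= (1+i\epsilon F)\,\alpha\,(1-i\epsilon F)
          = \alpha + i\epsilon(F\alpha-\alpha F) + \epsilon^2 F\alpha F
          \approx \alpha + i\epsilon(F\alpha-\alpha F), \\[4pt]
F = p/\hbar:\qquad
q^* - q &= \frac{i\epsilon}{\hbar}(pq-qp) = \frac{i\epsilon}{\hbar}(-i\hbar) = \epsilon, \qquad
p^* - p = \frac{i\epsilon}{\hbar}(pp-pp) = 0 .
\end{align*}
```

With this $F$ the transformation shifts $q$ by the number $\epsilon$ and leaves $p$ unchanged. The P.B. form gives the same result, since $\epsilon\hbar[q, p/\hbar] = \epsilon[q,p] = \epsilon$.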

If $\alpha$ is a canonical coordinate or momentum, the transformation (80) is
formally the same as a classical infinitesimal contact transformation.

27. Schrödinger's form for the equations of motion

Our work from § 5 onwards has all been concerned with one instant
of time. It gave the general scheme of relations between states and
dynamical variables for a dynamical system at one instant of time.
To get a complete theory of dynamics we must consider also the
connexion between different instants of time. When one makes an
observation on the dynamical system, the state of the system gets
changed in an unpredictable way, but in between observations
causality applies, in quantum mechanics as in classical mechanics,
and the system is governed by equations of motion which make the
state at one time determine the state at a later time. These equations
of motion we now proceed to study. They will apply so long as the
dynamical system is left undisturbed by any observation or similar
process.† Their general form can be deduced from the principle of
superposition of Chapter I.

† The preparation of a state is a process of this kind. It often takes the form of
making an observation and selecting the system when the result of the observation
turns out to be a certain pre-assigned number.

Let us consider a particular state of motion throughout the time
during which the system is left undisturbed. We shall have the state
at any time $t$ corresponding to a certain ket which depends on $t$ and
which may be written $|t\rangle$. If we deal with several of these states of
motion we distinguish them by giving them labels such as $A$, and we
then write the ket which corresponds to the state at time $t$ for one
of them $|At\rangle$. The requirement that the state at one time determines
the state at another time means that $|At_0\rangle$ determines $|At\rangle$ except
for a numerical factor. The principle of superposition applies to these
states of motion throughout the time during which the system is
undisturbed, and means that if we take a superposition relation
holding for certain states at time $t_0$ and giving rise to a linear equation
between the corresponding kets, e.g. the equation

$$|Rt_0\rangle = c_1|At_0\rangle + c_2|Bt_0\rangle,$$

the same superposition relation must hold between the states of
motion throughout the time during which the system is undisturbed
and must lead to the same equation between the kets corresponding
to these states at any time $t$ (in the undisturbed time interval), i.e.
the equation

$$|Rt\rangle = c_1|At\rangle + c_2|Bt\rangle,$$

provided the arbitrary numerical factors by which these kets may be
multiplied are suitably chosen. It follows that the $|Pt\rangle$'s are linear
functions of the $|Pt_0\rangle$'s and each $|Pt\rangle$ is the result of some linear
operator applied to $|Pt_0\rangle$. In symbols

$$|Pt\rangle = T|Pt_0\rangle, \tag{1}$$

where $T$ is a linear operator independent of $P$ and depending only
on $t$ (and $t_0$).

We now assume that each $|Pt\rangle$ has the same length as the
corresponding $|Pt_0\rangle$. It is not necessarily possible to choose the arbitrary
numerical factors by which the $|Pt\rangle$'s may be multiplied so as to
make this so without destroying the linear dependence of the $|Pt\rangle$'s
on the $|Pt_0\rangle$'s, so the new assumption is a physical one and not just
a question of notation. It involves a kind of sharpening of the
principle of superposition. The arbitrariness in $|Pt\rangle$ now becomes
merely a phase factor, which must be independent of $P$ in order that
the linear dependence of the $|Pt\rangle$'s on the $|Pt_0\rangle$'s may be preserved.
From the condition that the length of $c_1|Pt\rangle + c_2|Qt\rangle$ equals that of
$c_1|Pt_0\rangle + c_2|Qt_0\rangle$ for any complex numbers $c_1, c_2$, we can deduce that

$$\langle Qt|Pt\rangle = \langle Qt_0|Pt_0\rangle. \tag{2}$$
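
The deduction of (2) can be spelled out as follows. This is a sketch of the usual argument, not a quotation from the text; the symbol $D$ is introduced here only as an abbreviation.

```latex
\begin{align*}
\bigl\| c_1|Pt\rangle + c_2|Qt\rangle \bigr\|^2
 &= |c_1|^2\langle Pt|Pt\rangle + |c_2|^2\langle Qt|Qt\rangle
  + c_1\bar c_2\langle Qt|Pt\rangle + \bar c_1 c_2\langle Pt|Qt\rangle . \\
\intertext{Equating this with the same expression at time $t_0$, and using the fact that
$|Pt\rangle$ and $|Qt\rangle$ separately keep their lengths, only the cross terms remain:}
c_1\bar c_2\,D + \overline{c_1\bar c_2\,D} &= 0,
 \qquad D = \langle Qt|Pt\rangle - \langle Qt_0|Pt_0\rangle . \\
c_1 = c_2 = 1 &:\quad \operatorname{Re} D = 0, \qquad
c_1 = i,\ c_2 = 1 :\quad \operatorname{Re}(iD) = -\operatorname{Im} D = 0 .
\end{align*}
```

Both the real and the imaginary part of $D$ therefore vanish, which is equation (2).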

The connexion between the $|Pt\rangle$'s and $|Pt_0\rangle$'s is formally similar
to the connexion we had in § 25 between the displaced and undisplaced
kets, with a process of time displacement instead of the space
displacement of § 25. Equations (1) and (2) play the part of equations (59)
and (60) of § 25. We can develop the consequences of these equations
as in § 25 and can deduce that $T$ contains an arbitrary numerical
factor of modulus unity and satisfies

$$\bar TT = 1, \tag{3}$$


corresponding to (62) of § 25, so $T$ is unitary. We pass to the
infinitesimal case by making $t \to t_0$ and assume from physical continuity that
the limit

$$\lim_{t\to t_0}\frac{|Pt\rangle - |Pt_0\rangle}{t-t_0}$$

exists. This limit is just the derivative of $|Pt_0\rangle$ with respect to $t_0$.
From (1) it equals

$$\frac{d|Pt_0\rangle}{dt_0} = \left\{\lim_{t\to t_0}\frac{T-1}{t-t_0}\right\}|Pt_0\rangle. \tag{4}$$

The limit operator occurring here is, like (64) of § 25, a pure imaginary
linear operator and is undetermined to the extent of an arbitrary
additive pure imaginary number. Putting this limit operator
multiplied by $i\hbar$ equal to $H$, or rather $H(t_0)$ since it may depend on $t_0$,
equation (4) becomes, when written for a general $t$,

$$i\hbar\frac{d|Pt\rangle}{dt} = H(t)|Pt\rangle. \tag{5}$$


Equation (5) gives the general law for the variation with time of
the ket corresponding to the state at any time. It is Schrödinger's
form for the equations of motion. It involves just one real linear
operator $H(t)$, which must be characteristic of the dynamical system
under consideration. We assume that $H(t)$ is the total energy of
the system. There are two justifications for this assumption, (i) the
analogy with classical mechanics, which will be developed in the
next section, and (ii) we have $H(t)$ appearing as $i\hbar$ times an operator
of displacement in time similar to the operators of displacement in
the $x$, $y$, and $z$ directions of § 25, so corresponding to (69) of § 25
we should have $H(t)$ equal to the total energy, since the theory of
relativity puts energy in the same relation to time as momentum to
distance.

We assume on physical grounds that the total energy of a system
is always an observable. For an isolated system it is a constant, and
may then be written $H$. Even when it is not a constant we shall often
write it simply $H$, leaving its dependence on $t$ understood. If the
energy depends on $t$, it means the system is acted on by external
forces. An action of this kind is to be distinguished from a disturbance
caused by a process of observation, as the former is compatible
with causality and equations of motion while the latter is not.

We can get a connexion between $H(t)$ and the $T$ of equation (1)
by substituting for $|Pt\rangle$ in (5) its value given by equation (1). This
gives

$$i\hbar\frac{dT}{dt}|Pt_0\rangle = H(t)T|Pt_0\rangle.$$

Since $|Pt_0\rangle$ may be any ket, we have

$$i\hbar\frac{dT}{dt} = H(t)T. \tag{6}$$

Equation (5) is very important for practical problems, where it is
usually used in conjunction with a representation. Introducing a
representation with a complete set of commuting observables $\xi$
diagonal and putting $\langle\xi'|Pt\rangle$ equal to $\psi(\xi't)$, we have, passing to the
standard ket notation,

$$|Pt\rangle = \psi(\xi t)\rangle.$$

Equation (5) now becomes

$$i\hbar\frac{\partial}{\partial t}\psi(\xi t)\rangle = H\psi(\xi t)\rangle. \tag{7}$$

Equation (7) is known as Schrödinger's wave equation and its solutions
$\psi(\xi t)$ are time-dependent wave functions. Each solution corresponds to
a state of motion of the system and the square of its modulus gives
the probability of the $\xi$'s having specified values at any time $t$. For
a system describable in terms of canonical coordinates and momenta
we may use Schrödinger's representation and can then take $H$ to be
an operator of differentiation in accordance with (42) of § 22.
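
In a representation with only two basic kets, equation (5) becomes a pair of ordinary differential equations for the two components of the ket, and these can be integrated step by step. The sketch below is illustrative only: the two-level Hamiltonian `H`, the coupling `G`, the units with $\hbar = 1$, and the use of a classical fourth-order Runge-Kutta step are all assumptions of the example, not part of the text. The Runge-Kutta step does not conserve the length of the ket exactly, so the sketch also reports how well the length is preserved.

```python
import math

HBAR = 1.0                 # units with hbar = 1 (assumption for the example)
G = 1.5                    # coupling of the hypothetical two-level system
H = [[0.0, G], [G, 0.0]]   # a real (Hermitian) Hamiltonian, constant in time

def apply_op(op, psi):
    """Apply a matrix to the list of components of a ket."""
    return [sum(op[i][k] * psi[k] for k in range(len(psi)))
            for i in range(len(op))]

def rate(psi):
    """Equation (5) solved for the time derivative: -i H psi / hbar."""
    return [-1j * x / HBAR for x in apply_op(H, psi)]

def shifted(psi, k, s):
    """psi + s * k, component by component."""
    return [p + s * q for p, q in zip(psi, k)]

def rk4_step(psi, dt):
    """One classical fourth-order Runge-Kutta step of size dt."""
    k1 = rate(psi)
    k2 = rate(shifted(psi, k1, dt / 2))
    k3 = rate(shifted(psi, k2, dt / 2))
    k4 = rate(shifted(psi, k3, dt))
    return [p + dt / 6 * (a + 2 * b + 2 * c + d)
            for p, a, b, c, d in zip(psi, k1, k2, k3, k4)]

def evolve(psi0, t_final, steps):
    """Integrate from t0 = 0 to t_final in equal steps."""
    psi, dt = list(psi0), t_final / steps
    for _ in range(steps):
        psi = rk4_step(psi, dt)
    return psi

psi_t = evolve([1.0 + 0j, 0j], t_final=1.0, steps=1000)
# Closed-form solution for this particular H and initial ket.
exact = [math.cos(G * 1.0 / HBAR), -1j * math.sin(G * 1.0 / HBAR)]
error = max(abs(a - b) for a, b in zip(psi_t, exact))
norm_sq = sum(abs(a) ** 2 for a in psi_t)
print(error, norm_sq)
```

The comparison with the closed-form solution is possible only because this `H` is so simple; for a general Hamiltonian one would have the numerical solution alone.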

28.  Heisenberg’s  form  for  the  equations  of  motion 

In the preceding section we set up a picture of the states of
undisturbed motion by making each of them correspond to a moving
ket, the state at any time corresponding to the ket at that time. We
shall call this the Schrödinger picture. Let us apply to our kets the
unitary transformation which makes each ket $|a\rangle$ go over into

$$|a^*\rangle = T^{-1}|a\rangle. \tag{8}$$

This transformation is of the form given by (75) of § 26 with $T^{-1}$ for
$U$, but it depends on the time $t$ since $T$ depends on $t$. It is thus to be
pictured as the application of a continuous motion (consisting of
rotations and uniform deformations) to the whole ket vector space.
A ket which is originally fixed becomes a moving one, its motion being
given by (8) with $|a\rangle$ independent of $t$. On the other hand, a ket
which is originally moving to correspond to a state of undisturbed
motion, i.e. in accordance with equation (1), becomes fixed, since on
substituting $|Pt\rangle$ for $|a\rangle$ in (8) we get $|a^*\rangle$ independent of $t$. Thus
the transformation brings the kets corresponding to states of undisturbed
motion to rest.

The unitary transformation must be applied also to bras and linear
operators, in order that equations between the various quantities may
remain invariant. The transformation applied to bras is given by the
conjugate imaginary of (8) and applied to linear operators it is given
by (70) of § 26 with $T^{-1}$ for $U$, i.e.

$$\alpha^* = T^{-1}\alpha T. \tag{9}$$

A linear operator which is originally fixed transforms into a moving
linear operator in general. Now a dynamical variable corresponds to
a linear operator which is originally fixed (because it does not refer
to $t$ at all), so after the transformation it corresponds to a moving
linear operator. The transformation thus leads us to a new picture
of the motion, in which the states correspond to fixed vectors and
the dynamical variables to moving linear operators. We shall call
this the Heisenberg picture.

The  physical  condition  of  the  dynamical  system  at  any  time 
involves  the  relation  of  the  dynamical  variables  to  the  state,  and 
the  change  of  the  physical  condition  with  time  may  be  ascribed 
either  to  a change  in  the  state,  with  the  dynamical  variables  kept 
fixed,  which  gives  us  the  Schrodinger  picture,  or  to  a change  in  the 
dynamical  variables,  with  the  state  kept  fixed,  which  gives  us  the 
Heisenberg  picture. 

In the Heisenberg picture there are equations of motion for the
dynamical variables. Take a dynamical variable corresponding to
the fixed linear operator $v$ in the Schrödinger picture. In the
Heisenberg picture it corresponds to a moving linear operator, which we
write as $v_t$ instead of $v^*$, to bring out its dependence on $t$, and which
is given by

$$v_t = T^{-1}vT \tag{10}$$

or

$$Tv_t = vT.$$

Differentiating with respect to $t$, we get

$$\frac{dT}{dt}v_t + T\frac{dv_t}{dt} = v\frac{dT}{dt}.$$

With the help of (6), this gives

$$HTv_t + i\hbar T\frac{dv_t}{dt} = vHT$$

or

$$i\hbar\frac{dv_t}{dt} = T^{-1}vHT - T^{-1}HTv_t = v_tH_t - H_tv_t, \tag{11}$$

where

$$H_t = T^{-1}HT. \tag{12}$$

Equation (11) may be written in P.B. notation

$$\frac{dv_t}{dt} = [v_t, H_t]. \tag{13}$$

Equation (11) or (13) shows how any dynamical variable varies
with time in the Heisenberg picture and gives us Heisenberg's form
for the equations of motion. These equations of motion are determined
by the one linear operator $H_t$, which is just the transform of the linear
operator $H$ occurring in Schrödinger's form for the equations of
motion and corresponds to the energy in the Heisenberg picture. We
shall call the dynamical variables in the Heisenberg picture, where
they vary with the time, Heisenberg dynamical variables, to distinguish
them from the fixed dynamical variables of the Schrödinger picture,
which we shall call Schrödinger dynamical variables. Each Heisenberg
dynamical variable is connected with the corresponding Schrödinger
dynamical variable by equation (10). Since this connexion is a unitary
transformation, all algebraic and functional relationships are the
same for both kinds of dynamical variable. We have $T = 1$ for
$t = t_0$, so that $v_{t_0} = v$ and any Heisenberg dynamical variable at time
$t_0$ equals the corresponding Schrödinger dynamical variable.
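
Equations (10) and (11) can be checked numerically for a system with two states. What follows is a minimal sketch under stated assumptions: the diagonal Hamiltonian `H`, the operator `V_S`, the value of `OMEGA`, the units with $\hbar = 1$, the choice $t_0 = 0$, and all helper names are introduced for the example. The time derivative of $v_t$ is estimated by a central difference and compared with the commutator side of (11).

```python
import cmath

HBAR = 1.0      # units with hbar = 1 (assumption for the example)
OMEGA = 2.0     # level spacing divided by hbar, a hypothetical value

def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    """Conjugate transpose; equals the reciprocal for a unitary matrix."""
    return [[a[j][i].conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

H = [[HBAR * OMEGA / 2, 0], [0, -HBAR * OMEGA / 2]]   # Schrodinger-picture energy
V_S = [[0, 1], [1, 0]]                                 # fixed Schrodinger variable v

def T(t):
    """Solution of equation (6) for this constant diagonal H, with t0 = 0."""
    return [[cmath.exp(-1j * OMEGA * t / 2), 0],
            [0, cmath.exp(1j * OMEGA * t / 2)]]

def heisenberg(op, t):
    """Equation (10): op_t = T^{-1} op T."""
    return matmul(matmul(dagger(T(t)), op), T(t))

def commutator_side(t):
    """(v_t H_t - H_t v_t) / (i hbar), from equation (11)."""
    vt, ht = heisenberg(V_S, t), heisenberg(H, t)
    a, b = matmul(vt, ht), matmul(ht, vt)
    return [[(a[i][j] - b[i][j]) / (1j * HBAR) for j in range(2)]
            for i in range(2)]

def derivative_side(t, dt=1e-6):
    """Central-difference estimate of dv_t/dt."""
    plus, minus = heisenberg(V_S, t + dt), heisenberg(V_S, t - dt)
    return [[(plus[i][j] - minus[i][j]) / (2 * dt) for j in range(2)]
            for i in range(2)]

t = 0.37
lhs, rhs = derivative_side(t), commutator_side(t)
mismatch = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
print(mismatch)
```

The two sides agree to within the accuracy of the finite difference, and `heisenberg(V_S, 0.0)` reproduces `V_S`, in line with the remark that the Heisenberg and Schrödinger variables coincide at $t_0$.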

Equation (13) can be compared with classical mechanics, where we
also have dynamical variables varying with the time. The equations
of motion of classical mechanics can be written in the Hamiltonian
form

$$\frac{dq_r}{dt} = \frac{\partial H}{\partial p_r}, \qquad \frac{dp_r}{dt} = -\frac{\partial H}{\partial q_r}, \tag{14}$$

where the $q$'s and $p$'s are a set of canonical coordinates and momenta
and $H$ is the energy expressed as a function of them and possibly also
of $t$. The energy expressed in this way is called the Hamiltonian.
Equations (14) give, for $v$ any function of the $q$'s and $p$'s that does
not contain the time $t$ explicitly,

$$\frac{dv}{dt} = \sum_r\left\{\frac{\partial v}{\partial q_r}\frac{dq_r}{dt} + \frac{\partial v}{\partial p_r}\frac{dp_r}{dt}\right\}
 = \sum_r\left\{\frac{\partial v}{\partial q_r}\frac{\partial H}{\partial p_r} - \frac{\partial v}{\partial p_r}\frac{\partial H}{\partial q_r}\right\}
 = [v, H], \tag{15}$$

with the classical definition of a P.B., equation (1) of § 21. This is
of the same form as equation (13) in the quantum theory. We thus
get an analogy between the classical equations of motion in the
Hamiltonian form and the quantum equations of motion in
Heisenberg's form. This analogy provides a justification for the assumption
that the linear operator $H$ introduced in the preceding section is the
energy of the system in quantum mechanics.
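
As a brief worked illustration of (14) and (15), consider a single particle in one dimension. The Hamiltonian $H = p^2/2m + V(q)$, the mass $m$ and the potential $V$ are assumptions introduced for the example.

```latex
\begin{align*}
\frac{dq}{dt} &= [q,H]
  = \frac{\partial q}{\partial q}\frac{\partial H}{\partial p}
  - \frac{\partial q}{\partial p}\frac{\partial H}{\partial q}
  = \frac{p}{m}, \\[4pt]
\frac{dp}{dt} &= [p,H]
  = \frac{\partial p}{\partial q}\frac{\partial H}{\partial p}
  - \frac{\partial p}{\partial p}\frac{\partial H}{\partial q}
  = -\frac{dV}{dq} .
\end{align*}
```

These are the Newtonian equations of motion. With the same Hamiltonian, equation (13) leads to equations of the same form for the Heisenberg variables $q_t$ and $p_t$, which is the analogy described in the text.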

In classical mechanics a dynamical system is defined mathematically
when the Hamiltonian is given, i.e. when the energy is given
in terms of a set of canonical coordinates and momenta, as this is
sufficient to fix the equations of motion. In quantum mechanics a
dynamical system is defined mathematically when the energy is
given in terms of dynamical variables whose commutation relations
are known, as this is then sufficient to fix the equations of motion,
in both Schrödinger's and Heisenberg's form. We need to have
either $H$ expressed in terms of the Schrödinger dynamical variables
or $H_t$ expressed in terms of the corresponding Heisenberg dynamical
variables, the functional relationship being, of course, the same in
both cases. We call the energy expressed in this way the Hamiltonian
of the dynamical system in quantum mechanics, to keep up the
analogy with the classical theory.

A system in quantum mechanics always has a Hamiltonian, whether
the system is one that has a classical analogue and is describable in
terms of canonical coordinates and momenta or not. However, if the
system does have a classical analogue, its connexion with classical
mechanics is specially close and one can usually assume that the
Hamiltonian is the same function of the canonical coordinates and
momenta in the quantum theory as in the classical theory.† There
would be a difficulty in this, of course, if the classical Hamiltonian
involved a product of factors whose quantum analogues do not commute,
as one would not know in which order to put these factors in
the quantum Hamiltonian, but this does not happen for most of the
elementary dynamical systems whose study is important for atomic
physics. In consequence we are able also largely to use the same
language for describing dynamical systems in the quantum theory as
in the classical theory (e.g. to talk about particles with given masses
moving through given fields of force), and when given a system in
classical mechanics, can usually give a meaning to 'the same' system
in quantum mechanics.

† This assumption is found in practice to be successful only when applied with the
dynamical coordinates and momenta referring to a Cartesian system of axes and not
to more general curvilinear coordinates.

Equation (13) holds for $v_t$ any function of the Heisenberg dynamical
variables not involving the time explicitly, i.e. for $v$ any constant
linear operator in the Schrödinger picture. It shows that such a
function $v_t$ is constant if it commutes with $H_t$ or if $v$ commutes with $H$.
We then have

$$v_t = v_{t_0} = v,$$

and we call $v_t$ or $v$ a constant of the motion. It is necessary that $v$ shall
commute with $H$ at all times, which is usually possible only if $H$ is
constant. In this case we can substitute $H$ for $v$ in (13) and deduce
that $H_t$ is constant, showing that $H$ itself is then a constant of the
motion. Thus if the Hamiltonian is constant in the Schrödinger
picture, it is also constant in the Heisenberg picture.
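
A simple illustration of a constant of the motion is provided by a free particle in one dimension. The Hamiltonian $H = p^2/2m$ and the mass $m$ are assumptions made for this example; the P.B. $[q,p] = 1$ and the product rule for P.B.s are used, together with the fact that $H_t$ is the same function of $p_t$ as $H$ is of $p$.

```latex
\begin{align*}
\frac{dp_t}{dt} &= [p_t, H_t] = 0
  \quad\Longrightarrow\quad p_t = p, \\[4pt]
\frac{dq_t}{dt} &= [q_t, H_t]
  = \frac{1}{2m}\bigl([q_t,p_t]\,p_t + p_t\,[q_t,p_t]\bigr) = \frac{p_t}{m}
  \quad\Longrightarrow\quad q_t = q + \frac{p}{m}\,(t-t_0) .
\end{align*}
```

Here the momentum commutes with the Hamiltonian and is a constant of the motion, while the coordinate does not commute with it and varies linearly with the time.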

For an isolated system, a system not acted on by any external
forces, there are always certain constants of the motion. One of these
is the total energy or Hamiltonian. Others are provided by the
displacement theory of § 25. It is evident physically that the total
energy must remain unchanged if all the dynamical variables are
displaced in a certain way, so equation (63) of § 25 must hold with
$v_d = v = H$. Thus $D$ commutes with $H$ and is a constant of the
motion. Passing to the case of an infinitesimal displacement, we see
that the displacement operators $d_x$, $d_y$, and $d_z$ are constants of the
motion and hence, from (69) of § 25, the total momentum is a constant
of the motion. Again, the total energy must remain unchanged if all
the dynamical variables are subjected to a certain rotation. This
leads, as will be shown in § 35, to the result that the total angular
momentum is a constant of the motion. The laws of conservation of
energy, momentum, and angular momentum hold for an isolated system
in the Heisenberg picture in quantum mechanics, as they hold in
classical mechanics.

Two forms for the equations of motion of quantum mechanics have
now been given. Of these, the Schrödinger form is the more useful
one for practical problems, as it provides the simpler equations. The
unknowns in Schrödinger's wave equation are the numbers which
form the representative of a ket vector, while Heisenberg's equation
of motion for a dynamical variable, if expressed in terms of a
representation, would involve as unknowns the numbers forming the
representative of the dynamical variable. The latter are far more
numerous and therefore more difficult to evaluate than the
Schrödinger unknowns. Heisenberg's form for the equations of motion is
of value in providing an immediate analogy with classical mechanics
and enabling one to see how various features of classical theory, such
as the conservation laws referred to above, are translated into
quantum theory.

29.  Stationary  states 

We shall here deal with a dynamical system whose energy is
constant. Certain specially simple relations hold for this case. Equation
(6) can be integrated† to give

$$T = e^{-iH(t-t_0)/\hbar},$$

with the help of the initial condition that $T = 1$ for $t = t_0$. This
result substituted into (1) gives

$$|Pt\rangle = e^{-iH(t-t_0)/\hbar}|Pt_0\rangle, \tag{16}$$

which is the integral of Schrödinger's equation of motion (5), and
substituted into (10) it gives

$$v_t = e^{iH(t-t_0)/\hbar}\,v\,e^{-iH(t-t_0)/\hbar}, \tag{17}$$

which is the integral of Heisenberg's equation of motion (11), $H_t$ being
now equal to $H$. Thus we have solutions of the equations of motion
in a simple form. However, these solutions are not of much practical
value, because of the difficulty involved in evaluating the operator
$e^{-iH(t-t_0)/\hbar}$ unless $H$ is particularly simple, and for practical purposes
one usually has to fall back on Schrödinger's wave equation.

Let us consider a state of motion such that at time $t_0$ it is an
eigenstate of the energy. The ket $|Pt_0\rangle$ corresponding to it at this
time must be an eigenket of $H$. If $H'$ is the eigenvalue to which it
belongs, equation (16) gives

$$|Pt\rangle = e^{-iH'(t-t_0)/\hbar}|Pt_0\rangle,$$

showing that $|Pt\rangle$ differs from $|Pt_0\rangle$ only by a phase factor. Thus
the state always remains an eigenstate of the energy, and further, it
does not vary with the time at all, since the direction of the ket $|Pt\rangle$
does not vary with the time. Such a state is called a stationary state.
The probability for any particular result of an observation on it is
independent of the time when the observation is made. From our
assumption that the energy is an observable, there are sufficient
stationary states for an arbitrary state to be dependent on them.
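The phase-factor property can be checked numerically. The following is a minimal sketch, assuming units with hbar = 1 and a small Hermitian matrix `H` chosen purely for illustration (neither is from the text); it builds T from the eigen-decomposition of H, which is practical only because H here is "particularly simple".

```python
import numpy as np

hbar = 1.0  # units with hbar = 1 (an assumption of this sketch)

# A small Hermitian matrix standing in for a particularly simple H.
H = np.array([[1.0, 0.5, 0.0],
              [0.5, 2.0, 0.3],
              [0.0, 0.3, 3.0]])

energies, kets = np.linalg.eigh(H)   # columns of `kets` are eigenkets of H

def evolve(ket, t, t0=0.0):
    """Apply T = exp(-iH(t - t0)/hbar), built from the eigen-decomposition of H."""
    phases = np.exp(-1j * energies * (t - t0) / hbar)
    return kets @ (phases * (kets.conj().T @ ket))

ket0 = kets[:, 1].astype(complex)    # eigenket belonging to H' = energies[1]
ket_t = evolve(ket0, t=2.7)

# The evolved ket equals the initial ket times the phase factor exp(-iH't/hbar).
phase = np.exp(-1j * energies[1] * 2.7 / hbar)
phase_only = np.allclose(ket_t, phase * ket0)

# Probabilities in the original basis do not change with time.
probs_constant = np.allclose(np.abs(ket_t) ** 2, np.abs(ket0) ** 2)
print(phase_only, probs_constant)
```

Starting instead from a superposition of two eigenkets would make the probabilities oscillate, which is why only eigenstates of the energy are stationary.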

The time-dependent wave function $\psi(\xi t)$ representing a stationary
state of energy $H'$ will vary with time according to the law

$$\psi(\xi t) = \psi_0(\xi)\,e^{-iH't/\hbar}, \tag{18}$$

and Schrödinger's wave equation (7) for it reduces to

$$H'\psi_0\rangle = H\psi_0\rangle. \tag{19}$$

This equation merely asserts that the state represented by $\psi_0$ is an
eigenstate of $H$. We call a function $\psi_0$ satisfying (19) an eigenfunction
of $H$, belonging to the eigenvalue $H'$.

† The integration can be carried out as though $H$ were an ordinary algebraic
variable instead of a linear operator, because there is no quantity that does not
commute with $H$ in the work.

In the Heisenberg picture the stationary states correspond to fixed
eigenvectors of the energy. We can set up a representation in which
all the basic vectors are eigenvectors of the energy and so correspond
to stationary states in the Heisenberg picture. We call such a
representation a Heisenberg representation. The first form of quantum
mechanics, discovered by Heisenberg in 1925, was in terms of a
representation of this kind. The energy is diagonal in the
representation. Any other diagonal dynamical variable must commute with the
energy and is therefore a constant of the motion. The problem of
setting up a Heisenberg representation thus reduces to the problem
of finding a complete set of commuting observables, each of which
is a constant of the motion, and then making these observables
diagonal. The energy must be a function of these observables, from
Theorem 2 of § 19. It is sometimes convenient to take the energy
itself as one of them.

Let $\alpha$ denote the complete set of commuting observables in a
Heisenberg representation, so that the basic vectors are written $\langle\alpha'|$,
$|\alpha''\rangle$. The energy is a function of these observables $\alpha$, say $H = H(\alpha)$.
From (17) we get

$$\langle\alpha'|v_t|\alpha''\rangle = \langle\alpha'|e^{iH(t-t_0)/\hbar}\,v\,e^{-iH(t-t_0)/\hbar}|\alpha''\rangle
 = e^{i(H'-H'')(t-t_0)/\hbar}\langle\alpha'|v|\alpha''\rangle, \tag{20}$$

where $H' = H(\alpha')$ and $H'' = H(\alpha'')$. The factor $\langle\alpha'|v|\alpha''\rangle$ on the
right-hand side here is independent of $t$, being an element of the matrix
representing the fixed linear operator $v$. Formula (20) shows how the
Heisenberg matrix elements of any Heisenberg dynamical variable
vary with time, and it makes $v_t$ satisfy the equation of motion (11),
as is easily verified. The variation given by (20) is simply periodic
with the frequency

$$|H'-H''|/2\pi\hbar = |H'-H''|/h, \tag{21}$$

depending only on the energy difference of the two stationary states
to which the matrix element refers. This result is closely connected
with the Combination Law of Spectroscopy and Bohr's Frequency
Condition, according to which (21) is the frequency of the
electromagnetic radiation emitted or absorbed when the system makes a
transition under the influence of radiation between the stationary
states $\alpha'$ and $\alpha''$, the eigenvalues of $H$ being Bohr's energy levels.
These matters will be dealt with in § 45.
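The periodic variation of the Heisenberg matrix elements can be illustrated with a short numerical sketch. The three energy levels and the random symmetric matrix `v` below are hypothetical choices made for this example, and units with hbar = 1 are assumed.

```python
import numpy as np

hbar = 1.0
rng = np.random.default_rng(0)

# Heisenberg representation: H is diagonal, with eigenvalues H(alpha').
levels = np.array([0.0, 1.0, 2.5])
v = rng.normal(size=(3, 3))
v = v + v.T                       # a fixed real symmetric (Hermitian) operator v

def v_heisenberg(t, t0=0.0):
    """Equation (17) with diagonal H: v_t = exp(iHt/hbar) v exp(-iHt/hbar)."""
    U = np.diag(np.exp(-1j * levels * (t - t0) / hbar))
    return U.conj().T @ v @ U

t = 1.3
lhs = v_heisenberg(t)

# Equation (20): each element carries the phase exp(i(H' - H'')(t - t0)/hbar).
dE = levels[:, None] - levels[None, :]
rhs = np.exp(1j * dE * t / hbar) * v
matches_eq20 = np.allclose(lhs, rhs)

# Frequency (21) for the element between levels 0 and 2: |H' - H''| / (2 pi hbar).
freq_02 = abs(levels[0] - levels[2]) / (2 * np.pi * hbar)
period = 1.0 / freq_02
periodic = np.isclose(v_heisenberg(t + period)[0, 2], lhs[0, 2])
print(matches_eq20, periodic)
```

The diagonal elements have $H' = H''$ and so do not vary with time at all, in line with the remark that diagonal variables are constants of the motion.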


30.  The  free  particle 

The most fundamental and elementary application of quantum
mechanics is to the system consisting merely of a free particle, or
particle not acted on by any forces. For dealing with it we use as
dynamical variables the three Cartesian coordinates $x$, $y$, $z$ and their
conjugate momenta $p_x$, $p_y$, $p_z$. The Hamiltonian is equal to the
kinetic energy of the particle, namely

$$H = \frac{1}{2m}(p_x^2+p_y^2+p_z^2) \tag{22}$$

according to Newtonian mechanics, $m$ being the mass. This formula
is valid only if the velocity of the particle is small compared with $c$,
the velocity of light. For a rapidly moving particle, such as we often
have to deal with in atomic theory, (22) must be replaced by the
relativistic formula

$$H = c(m^2c^2+p_x^2+p_y^2+p_z^2)^{1/2}. \tag{23}$$

For small values of $p_x$, $p_y$, and $p_z$ (23) goes over into (22), except for
the constant term $mc^2$ which corresponds to the rest-energy of the
particle in the theory of relativity and which has no influence on the
equations of motion. Formulas (22) and (23) can be taken over
directly into the quantum theory, the square root in (23) being now
understood as the positive square root defined at the end of § 11.
The constant term $mc^2$ by which (23) differs from (22) for small values
of $p_x$, $p_y$, and $p_z$ can still have no physical effects, since the
Hamiltonian in the quantum theory, as introduced in § 27, is undefined to
the extent of an arbitrary additive real constant.

We shall here work with the more accurate formula (23). We shall
first solve the Heisenberg equations of motion. From the quantum
conditions (9) of § 21, $p_x$ commutes with $p_y$ and $p_z$, and hence, from
Theorem 1 of § 19 extended to a set of commuting observables, $p_x$
commutes with any function of $p_x$, $p_y$, and $p_z$ and therefore with $H$.
It follows that $p_x$ is a constant of the motion. Similarly $p_y$ and $p_z$ are
constants of the motion. These results are the same as in the classical
theory. Again, the equation of motion for a coordinate, $x_t$ say, is,
according to (11),

$$i\hbar\frac{dx_t}{dt} = i\hbar\,\dot x_t
 = x_t\,c(m^2c^2+p_x^2+p_y^2+p_z^2)^{1/2} - c(m^2c^2+p_x^2+p_y^2+p_z^2)^{1/2}\,x_t.$$

The right-hand side here can be evaluated by means of formula
(31) of § 22 with the roles of coordinates and momenta interchanged,
so that it reads

$$q_r f - f q_r = i\hbar\,\partial f/\partial p_r, \tag{24}$$

$f$ now being any function of the $p$'s. This gives

$$\dot x_t = \frac{\partial}{\partial p_x}\,c(m^2c^2+p_x^2+p_y^2+p_z^2)^{1/2} = \frac{c^2p_x}{H}. \tag{25}$$

Similarly,

$$\dot y_t = \frac{c^2p_y}{H}, \qquad \dot z_t = \frac{c^2p_z}{H}.$$

The magnitude of the velocity is

$$v = (\dot x_t^2+\dot y_t^2+\dot z_t^2)^{1/2} = c^2(p_x^2+p_y^2+p_z^2)^{1/2}/H. \tag{26}$$

Equations (25) and (26) are just the same as in the classical theory.
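Since (25) and (26) coincide with the classical formulas, they can be explored with ordinary numbers. The sketch below is illustrative only: the electron mass and the sample momenta are assumptions chosen for the example. It checks that (23) minus the rest-energy approaches (22) for small momenta, and that the speed (26) stays below c for large ones.

```python
import math

c = 299_792_458.0          # speed of light in m/s
m = 9.109_383_7e-31        # electron mass in kg (illustrative choice)

def hamiltonian(px, py, pz):
    """Relativistic formula (23)."""
    return c * math.sqrt(m**2 * c**2 + px**2 + py**2 + pz**2)

def velocity(px, py, pz):
    """Equations (25): velocity components c^2 p / H."""
    H = hamiltonian(px, py, pz)
    return tuple(c**2 * p / H for p in (px, py, pz))

def speed(px, py, pz):
    """Magnitude of the velocity, equation (26)."""
    return math.sqrt(sum(u**2 for u in velocity(px, py, pz)))

# Slow particle: H - m c^2 approaches the Newtonian value (22).
p_small = 1e-3 * m * c
kinetic = hamiltonian(p_small, 0.0, 0.0) - m * c**2
newtonian = p_small**2 / (2 * m)
rel_diff = abs(kinetic - newtonian) / newtonian

# Fast particle: the speed stays below c however large the momentum.
fast_speed = speed(50 * m * c, 20 * m * c, 0.0)
print(rel_diff, fast_speed / c)
```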

Let us consider a state that is an eigenstate of the momenta,
belonging to the eigenvalues $p'_x$, $p'_y$, $p'_z$. This state must be an
eigenstate of the Hamiltonian, belonging to the eigenvalue

$$H' = c(m^2c^2+p'^2_x+p'^2_y+p'^2_z)^{1/2}, \tag{27}$$

and must therefore be a stationary state. The possible values for $H'$
are all numbers from $mc^2$ to $\infty$, as in the classical theory. The wave
function $\psi(xyz)$ representing this state at any time in Schrödinger's
representation must satisfy

$$p'_x\psi(xyz)\rangle = p_x\psi(xyz)\rangle = -i\hbar\,\frac{\partial\psi(xyz)}{\partial x}\rangle,$$

with similar equations for $p_y$ and $p_z$. These equations show that
$\psi(xyz)$ is of the form

$$\psi(xyz) = a\,e^{i(p'_x x+p'_y y+p'_z z)/\hbar}, \tag{28}$$

where $a$ is independent of $x$, $y$, and $z$. From (18) we see now that the
time-dependent wave function $\psi(xyzt)$ is of the form

$$\psi(xyzt) = a_0\,e^{i(p'_x x+p'_y y+p'_z z-H't)/\hbar}, \tag{29}$$

where $a_0$ is independent of $x$, $y$, $z$, and $t$.

The function (29) of $x$, $y$, $z$, and $t$ describes plane waves in
space-time. We see from this example the suitability of the terms 'wave
function' and 'wave equation'. The frequency of the waves is

$$\nu = H'/h, \tag{30}$$

their wavelength is

$$\lambda = h/P', \tag{31}$$

$P'$ being the length of the vector $(p'_x, p'_y, p'_z)$, and their motion is in
the direction specified by the vector $(p'_x, p'_y, p'_z)$ with the velocity

$$\lambda\nu = H'/P' = c^2/v', \tag{32}$$

$v'$ being the velocity of the particle corresponding to the momentum
$(p'_x, p'_y, p'_z)$ as given by formula (26). Equations (30), (31), and (32)
are easily seen to hold in all Lorentz frames of reference, the
expression on the right-hand side of (29) being, in fact, relativistically
invariant with $p'_x$, $p'_y$, $p'_z$ and $H'$ as the components of a 4-vector.
These properties of relativistic invariance led de Broglie, before the
discovery of quantum mechanics, to postulate the existence of waves
of the form (29) associated with the motion of any particle. They
are therefore known as de Broglie waves.
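As a quick numerical illustration of (30), (31) and (32), the sketch below evaluates the frequency, wavelength and wave velocity for one momentum. The units (c = 1, hbar = 1, m = 1) and the sample value P = 0.75 are assumptions made for the example.

```python
import math

c = 1.0                    # units with c = 1 (assumption of this sketch)
h = 2 * math.pi            # Planck's constant h when hbar = 1
m = 1.0

def de_broglie(P):
    """Return frequency (30), wavelength (31) and particle speed (26) for momentum magnitude P."""
    H = c * math.sqrt(m**2 * c**2 + P**2)   # energy eigenvalue (27)
    nu = H / h
    lam = h / P
    v_particle = c**2 * P / H
    return nu, lam, v_particle

nu, lam, v_particle = de_broglie(P=0.75)
phase_velocity = lam * nu                   # left-hand side of (32)
print(phase_velocity, c**2 / v_particle)
```

The wave velocity exceeds c whenever the particle velocity is below c; it is the velocity of the phase, not of anything observable moving with the particle.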

In the limiting case when the mass $m$ is made to tend to zero, the
classical velocity of the particle $v$ becomes equal to $c$ and hence, from
(32), the wave velocity also becomes $c$. The waves are then like the
light-waves associated with a photon, with the difference that they
contain no reference to the polarization and involve a complex
exponential instead of sines and cosines. Formulas (30) and (31) are
still valid, connecting the frequency of the light-waves with the
energy of the photon and the wavelength of the light-waves with
the momentum of the photon.

For the state represented by (29), the probability of the particle
being found in any specified small volume when an observation of its
position is made is independent of where the volume is. This provides
an example of Heisenberg's principle of uncertainty, the state being
one for which the momentum is accurately given and for which, in
consequence, the position is completely unknown. Such a state is,
of course, a limiting case which never occurs in practice. The states
usually met with in practice are those represented by wave packets,
which may be formed by superposing a number of waves of the type
(29) belonging to slightly different values of $(p'_x, p'_y, p'_z)$, as discussed
in § 24. The ordinary formula in hydrodynamics for the velocity of
such a wave packet, i.e. the group velocity of the waves, is

$$\frac{d\nu}{d(1/\lambda)}, \tag{33}$$

which gives, from (30) and (31),

$$\frac{dH'}{dP'} = c\,\frac{d}{dP'}(m^2c^2+P'^2)^{1/2} = \frac{c^2P'}{H'} = v'. \tag{34}$$

This is just the velocity of the particle. The wave packet moves in
the same direction and with the same velocity as the particle moves
in classical mechanics.
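The group-velocity result can be checked without any calculus by a finite difference. This is a sketch under assumed units (c = 1, hbar = 1, m = 1); the step `dP` and the sample momentum are arbitrary illustrative choices.

```python
import math

c, m = 1.0, 1.0
h = 2 * math.pi            # Planck's constant h when hbar = 1

def energy(P):
    """Energy eigenvalue (27) as a function of the momentum magnitude P."""
    return c * math.sqrt(m**2 * c**2 + P**2)

def group_velocity(P, dP=1e-6):
    """Central-difference estimate of d(nu)/d(1/lambda), the hydrodynamic formula (33)."""
    nu_hi, nu_lo = energy(P + dP) / h, energy(P - dP) / h   # nu = H'/h from (30)
    k_hi, k_lo = (P + dP) / h, (P - dP) / h                 # 1/lambda = P'/h from (31)
    return (nu_hi - nu_lo) / (k_hi - k_lo)

P = 0.75
v_group = group_velocity(P)
v_particle = c**2 * P / energy(P)           # particle velocity from (26)
print(v_group, v_particle)
```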

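31. The motion of wave packets

The result just deduced for a free particle is an example of a general
principle. For any dynamical system with a classical analogue, a state
for which the classical description is valid as an approximation is
represented in quantum mechanics by a wave packet, all the
coordinates and momenta having approximate values, whose
accuracy is limited by Heisenberg's principle of uncertainty. Now
Schrödinger's wave equation fixes how such a wave packet varies with
time, so in order that the classical description may remain valid, the
wave packet should remain a wave packet and should move according
to the laws of classical dynamics. We shall verify that this is so.

We take a dynamical system having a classical analogue and let
its Hamiltonian be $H(q_r, p_r)$ $(r = 1, 2, \ldots, n)$. The corresponding
classical dynamical system will have as Hamiltonian $H_c(q_r, p_r)$ say, obtained
by putting ordinary algebraic variables for the $q_r$ and $p_r$ in $H(q_r, p_r)$
and making $\hbar \to 0$ if it occurs in $H(q_r, p_r)$. The classical Hamiltonian
$H_c$ is, of course, a real function of its variables. It is usually a quadratic
function of the momenta $p_r$, but not always so, the relativistic theory
of a free particle being an example where it is not. The following
argument is valid for $H_c$ any algebraic function of the $p$'s.

We suppose that the time-dependent wave function in
Schrödinger's representation is of the form

$$\psi(qt) = Ae^{iS/\hbar}, \tag{35}$$

where $A$ and $S$ are real functions of the $q$'s and $t$ which do not vary
very rapidly with their arguments. The wave function is thus of the
form of waves, with $A$ and $S$ determining the amplitude and phase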

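respectively. Schrödinger's wave equation (7) gives

$$i\hbar\frac{\partial}{\partial t}Ae^{iS/\hbar}\rangle = H(q_r, p_r)Ae^{iS/\hbar}\rangle$$

or

$$\Big\{i\hbar\frac{\partial A}{\partial t} - A\frac{\partial S}{\partial t}\Big\}\rangle
 = e^{-iS/\hbar}H(q_r, p_r)Ae^{iS/\hbar}\rangle. \tag{36}$$

Now $e^{-iS/\hbar}$ is evidently a unitary linear operator and may be used for
$U$ in equation (70) of § 26 to give us a unitary transformation. The
$q$'s remain unchanged by this transformation, each $p_r$ goes over into

$$e^{-iS/\hbar}p_r e^{iS/\hbar} = p_r + \partial S/\partial q_r,$$

with the help of (31) of § 22, and $H$ goes over into

$$e^{-iS/\hbar}H(q_r, p_r)e^{iS/\hbar} = H(q_r, p_r + \partial S/\partial q_r),$$

since algebraic relations are preserved by the transformation. Thus
(36) becomes

$$\Big\{i\hbar\frac{\partial A}{\partial t} - A\frac{\partial S}{\partial t}\Big\}\rangle
 = H\Big(q_r, p_r + \frac{\partial S}{\partial q_r}\Big)A\rangle. \tag{37}$$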

Let us now suppose that $\hbar$ can be counted as small and let us neglect
terms involving $\hbar$ in (37). This involves neglecting the $p_r$'s that occur
in $H$ in (37), since each $p_r$ is equivalent to the operator $-i\hbar\,\partial/\partial q_r$
operating on the functions of the $q$'s to the right of it. The surviving
terms give

$$-\frac{\partial S}{\partial t} = H_c\Big(q_r, \frac{\partial S}{\partial q_r}\Big). \tag{38}$$

This is a differential equation which the phase function $S$ has to
satisfy. The equation is determined by the classical Hamiltonian
function $H_c$ and is known as the Hamilton-Jacobi equation in classical
dynamics. It allows $S$ to be real and so shows that the assumption
of the wave form (35) does not lead to an inconsistency.

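To obtain an equation for $A$, we must retain the terms in (37)
which are linear in $\hbar$ and see what they give. A direct evaluation of
these terms is rather awkward in the case of a general function $H$,
and we can get the result we require more easily by first multiplying
both sides of (37) by the bra vector $\langle Af$, where $f$ is an arbitrary real
function of the $q$'s. This gives

$$\langle Af\Big\{i\hbar\frac{\partial A}{\partial t} - A\frac{\partial S}{\partial t}\Big\}\rangle
 = \langle Af\,H\Big(q_r, p_r + \frac{\partial S}{\partial q_r}\Big)A\rangle.$$

The conjugate complex equation is

$$\langle Af\Big\{-i\hbar\frac{\partial A}{\partial t} - A\frac{\partial S}{\partial t}\Big\}\rangle
 = \langle A\,H\Big(q_r, p_r + \frac{\partial S}{\partial q_r}\Big)fA\rangle.$$

Subtracting and dividing out by $i\hbar$, we obtain

$$2\langle Af\frac{\partial A}{\partial t}\rangle
 = \langle A\Big[f, H\Big(q_r, p_r + \frac{\partial S}{\partial q_r}\Big)\Big]A\rangle. \tag{39}$$

We now have to evaluate the P.B.

$$[f, H(q_r, p_r + \partial S/\partial q_r)].$$

Our assumption that $\hbar$ can be counted as small enables us to expand
$H(q_r, p_r + \partial S/\partial q_r)$ as a power series in the $p$'s. The terms of zero degree
will contribute nothing to the P.B. The terms of the first degree in
the $p$'s give a contribution to the P.B. which can be evaluated most
easily with the help of the classical formula (1) of § 21 (this formula
being valid also in the quantum theory if $u$ is independent of the $p$'s
and $v$ is linear in the $p$'s). The amount of this contribution is

$$\sum_s \frac{\partial f}{\partial q_s}\Big[\frac{\partial H_c(q_r, p_r)}{\partial p_s}\Big]_{p_r = \partial S/\partial q_r},$$

the notation meaning that we must substitute $\partial S/\partial q_r$ for each $p_r$ in
the function [ ] of the $q$'s and $p$'s, so as to obtain a function of the $q$'s
only. The terms of higher degree in the $p$'s give contributions to the
P.B. which vanish when $\hbar \to 0$. Thus (39) becomes, with neglect of
terms involving $\hbar$, which is equivalent to the neglect of $\hbar^2$ in (37),

$$\langle f\frac{\partial A^2}{\partial t}\rangle
 = \langle A^2\sum_s \frac{\partial f}{\partial q_s}\Big[\frac{\partial H_c(q_r, p_r)}{\partial p_s}\Big]_{p_r = \partial S/\partial q_r}\rangle. \tag{40}$$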

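Now if $a(q)$ and $b(q)$ are any two functions of the $q$'s, formula
(64) of § 20 gives

$$\langle a(q)b(q)\rangle = \int a(q')\,dq'\,b(q'),$$

and so

$$\langle a(q)\frac{\partial b(q)}{\partial q_r}\rangle = -\langle\frac{\partial a(q)}{\partial q_r}b(q)\rangle, \tag{41}$$

provided $a(q)$ and $b(q)$ satisfy suitable boundary conditions, as
discussed in §§ 22 and 23. Hence (40) may be written

$$\langle f\frac{\partial A^2}{\partial t}\rangle
 = -\langle f\sum_s \frac{\partial}{\partial q_s}\Big\{A^2\Big[\frac{\partial H_c(q_r, p_r)}{\partial p_s}\Big]_{p_r = \partial S/\partial q_r}\Big\}\rangle.$$

Since this holds for an arbitrary real function $f$, we must have

$$\frac{\partial A^2}{\partial t}
 = -\sum_s \frac{\partial}{\partial q_s}\Big\{A^2\Big[\frac{\partial H_c(q_r, p_r)}{\partial p_s}\Big]_{p_r = \partial S/\partial q_r}\Big\}. \tag{42}$$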

This is the equation for the amplitude $A$ of the wave function. To
get an understanding of its significance, let us suppose we have a fluid
moving in the space of the variables $q$, the density of the fluid at any
point and time being $A^2$ and its velocity being

$$\frac{dq_s}{dt} = \Big[\frac{\partial H_c(q_r, p_r)}{\partial p_s}\Big]_{p_r = \partial S/\partial q_r}. \tag{43}$$


Equation (42) is then just the equation of conservation for such a
fluid. The motion of the fluid is determined by the function $S$
satisfying (38), there being one possible motion for each solution
of (38).
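As a worked illustration (not part of the original argument), take a single degree of freedom with the classical Hamiltonian $H_c = p^2/2m + V(q)$, where the potential $V$ is an assumption introduced only for this example. The Hamilton-Jacobi equation (38), the conservation equation (42) and the fluid velocity (43) then take the explicit forms below.

```latex
% Illustrative special case: one coordinate q, H_c = p^2/(2m) + V(q).
\begin{align*}
-\frac{\partial S}{\partial t}
  &= \frac{1}{2m}\Big(\frac{\partial S}{\partial q}\Big)^{2} + V(q)
  && \text{(38), Hamilton-Jacobi equation} \\
\frac{\partial A^{2}}{\partial t}
  &= -\frac{\partial}{\partial q}\Big\{A^{2}\,\frac{1}{m}\frac{\partial S}{\partial q}\Big\}
  && \text{(42), conservation of the fluid} \\
\frac{dq}{dt}
  &= \frac{1}{m}\frac{\partial S}{\partial q}
  && \text{(43), velocity of the fluid}
\end{align*}
```

In this special case the fluid velocity is simply the momentum $\partial S/\partial q$ divided by the mass, and the second line has the familiar form of a continuity equation with density $A^2$.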

For a given $S$, let us take a solution of (42) for which at some
definite time the density $A^2$ vanishes everywhere outside a certain
small region. We may suppose this region to move with the fluid,
its velocity at each point being given by (43), and then the equation
of conservation (42) will require the density always to vanish outside
the region. There is a limit to how small the region may be, imposed
by the approximation we made in neglecting $\hbar$ in (39). This
approximation is valid only provided

$$\hbar\frac{\partial A}{\partial q_r} \ll \frac{\partial S}{\partial q_r}A,$$

or

$$\frac{1}{A}\frac{\partial A}{\partial q_r} \ll \frac{1}{\hbar}\frac{\partial S}{\partial q_r},$$

which requires that $A$ shall vary by an appreciable fraction of itself
only through a range of the $q$'s in which $S$ varies by many times $\hbar$,
i.e. a range consisting of many wavelengths of the wave function (35).
Our solution is then a wave packet of the type discussed in § 24 and
remains so for all time.

We thus get a wave function representing a state of motion for
which the coordinates and momenta have approximate numerical
values throughout all time. Such a state of motion in quantum
theory corresponds to the states with which classical theory deals.
The motion of our wave packet is determined by equations (38) and
(43). From these we get, defining $p_s$ as $\partial S/\partial q_s$,

$$\frac{dp_s}{dt} = \frac{d}{dt}\frac{\partial S}{\partial q_s}
 = \frac{\partial^2 S}{\partial t\,\partial q_s} + \sum_u \frac{\partial^2 S}{\partial q_u\,\partial q_s}\frac{dq_u}{dt}$$

$$= -\frac{\partial}{\partial q_s}H_c\Big(q_r, \frac{\partial S}{\partial q_r}\Big)
 + \sum_u \frac{\partial^2 S}{\partial q_u\,\partial q_s}\frac{\partial H_c(q_r, p_r)}{\partial p_u}$$

$$= -\frac{\partial H_c(q_r, p_r)}{\partial q_s}, \tag{44}$$

where in the last line the $p$'s are counted as independent of the $q$'s
before the partial differentiation. Equations (43) and (44) are just
the classical equations of motion in Hamiltonian form and show that
the wave packet moves according to the laws of classical mechanics.
We see in this way how the classical equations of motion are derivable
from the quantum theory as a limiting case.

By a more accurate solution of the wave equation one can show
that the accuracy with which the coordinates and momenta
simultaneously have numerical values cannot remain permanently as
favourable as the limit allowed by Heisenberg's principle of
uncertainty, equation (56) of § 24, but if it is initially so it will become
less favourable, the wave packet undergoing a spreading.†
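Both statements, classical motion of the packet and its gradual spreading, can be seen in a small numerical experiment. The sketch below is not the text's method: it assumes the non-relativistic Hamiltonian (22) in one dimension, units with hbar = m = 1, a Gaussian initial packet, and a periodic grid evolved exactly in momentum space with the FFT. The closed-form width used for comparison is the standard result for a free Gaussian packet.

```python
import numpy as np

hbar, m = 1.0, 1.0                 # units chosen for convenience (assumption)
N, L = 2048, 200.0                 # grid points and box length (illustrative)
x = np.linspace(-L / 2, L / 2, N, endpoint=False)
dx = L / N
p = 2 * np.pi * hbar * np.fft.fftfreq(N, d=dx)   # momentum grid

sigma0, p0, t = 2.0, 2.0, 5.0      # initial width, mean momentum, elapsed time
psi0 = (2 * np.pi * sigma0**2) ** -0.25 * np.exp(
    -x**2 / (4 * sigma0**2) + 1j * p0 * x / hbar)

# Free evolution multiplies each momentum component by exp(-i p^2 t / (2 m hbar)).
psi_t = np.fft.ifft(np.exp(-1j * p**2 * t / (2 * m * hbar)) * np.fft.fft(psi0))

def mean_and_width(psi):
    """Mean position and root-mean-square width of the density |psi|^2."""
    rho = np.abs(psi) ** 2 * dx
    mean = np.sum(x * rho)
    return mean, np.sqrt(np.sum((x - mean) ** 2 * rho))

mean0, width0 = mean_and_width(psi0)
mean_t, width_t = mean_and_width(psi_t)
expected_width = sigma0 * np.sqrt(1 + (hbar * t / (2 * m * sigma0**2)) ** 2)
print(mean_t, width_t, expected_width)
```

The centre of the packet moves with the classical velocity p0/m, while the width grows from its initial value, so the product of the uncertainties in position and momentum does not stay at its minimum.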

32. The action principle‡

Equation (10) shows that the Heisenberg dynamical variables at
time $t$, $v_t$, are connected with their values at time $t_0$, $v_{t_0}$ or $v$, by a
unitary transformation. The Heisenberg variables at time $t+\delta t$ are
connected with their values at time $t$ by an infinitesimal unitary
transformation, as is shown by the equation of motion (11) or (13),
which gives the connexion between $v_{t+\delta t}$ and $v_t$ of the form of (79) or
(80) of § 26 with $H_t$ for $F$ and $\delta t/\hbar$ for $\epsilon$. The variation with time of
the Heisenberg dynamical variables may thus be looked upon as the
continuous unfolding of a unitary transformation. In classical
mechanics the dynamical variables at time $t+\delta t$ are connected with
their values at time $t$ by an infinitesimal contact transformation and
the whole motion may be looked upon as the continuous unfolding of a
contact transformation. We have here the mathematical foundation
of the analogy between the classical and quantum equations of
motion, and can develop it to bring out the quantum analogue of all
the main features of the classical theory of dynamics.

Suppose we have a representation in which the complete set of
commuting observables $\xi$ are diagonal, so that a basic bra is $\langle\xi'|$.
We can introduce a second representation in which the basic bras are

$$\langle\xi'^*| = \langle\xi'|T. \tag{45}$$

The new basic bras depend on the time $t$ and give us a moving
representation, like a moving system of axes in an ordinary vector
space. Comparing (45) with the conjugate imaginary of (8), we see
that the new basic vectors are just the transforms in the Heisenberg
picture of the original basic vectors in the Schrödinger picture, and
hence they must be connected with the Heisenberg dynamical
variables $v_t$ in the same way in which the original basic vectors are
connected with the Schrödinger dynamical variables $v$. In particular,
each $\langle\xi'^*|$ must be an eigenvector of the $\xi_t$'s belonging to the
eigenvalues $\xi'$. It may therefore be written $\langle\xi'_t|$, with the understanding
that the numbers $\xi'_t$ are the same eigenvalues of the $\xi_t$'s that the $\xi'$'s
are of the $\xi$'s. From (45) we get

$$\langle\xi'_t|\xi''\rangle = \langle\xi'|T|\xi''\rangle, \tag{46}$$

showing that the transformation function is just the representative
of $T$ in the original representation.

† See Kennard, Z. f. Physik, 44 (1927), 341; Darwin, Proc. Roy. Soc. A, 117 (1927), 258.
‡ This section may be omitted by the student who is not specially concerned with
higher dynamics.

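Differentiating (45) with respect to $t$ and using (6), we get

$$i\hbar\frac{d}{dt}\langle\xi'_t| = i\hbar\langle\xi'|\frac{dT}{dt} = \langle\xi'|HT = \langle\xi'_t|H_t,$$

with the help of (12). Multiplying on the right by any ket $|a\rangle$
independent of $t$, we get

$$i\hbar\frac{d}{dt}\langle\xi'_t|a\rangle = \langle\xi'_t|H_t|a\rangle
 = \int\langle\xi'_t|H_t|\xi''_t\rangle\,d\xi''_t\,\langle\xi''_t|a\rangle, \tag{47}$$

if we take for definiteness the case of continuous eigenvalues for the
$\xi$'s. Now equation (5), written in terms of representatives, reads

$$i\hbar\frac{d}{dt}\langle\xi'|Pt\rangle = \int\langle\xi'|H|\xi''\rangle\,d\xi''\,\langle\xi''|Pt\rangle. \tag{48}$$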

Since $\langle\xi'_t|H_t|\xi''_t\rangle$ is the same function of the variables $\xi'_t$ and $\xi''_t$ that
$\langle\xi'|H|\xi''\rangle$ is of $\xi'$ and $\xi''$, equations (47) and (48) are of precisely the
same form, with the variables $\xi'_t$, $\xi''_t$ in (47) playing the role of the
variables $\xi'$ and $\xi''$ in (48) and the function $\langle\xi'_t|a\rangle$ playing the role
of the function $\langle\xi'|Pt\rangle$. We can thus look upon (47) as a form of
Schrödinger's wave equation, with the function $\langle\xi'_t|a\rangle$ of the variables
$\xi'_t$ as the wave function. In this way Schrödinger's wave equation
appears in a new light, as the condition on the representative, in the
moving representation with the Heisenberg variables $\xi_t$ diagonal, of the
fixed ket corresponding to a state in the Heisenberg picture. The function
$\langle\xi'_t|a\rangle$ owes its variation with time to its left factor $\langle\xi'_t|$, in
contradistinction to the function $\langle\xi'|Pt\rangle$, which owes its variation with time
to its right factor $|Pt\rangle$.

If we put $|a\rangle = |\xi''\rangle$ in (47), we get

$$i\hbar\frac{d}{dt}\langle\xi'_t|\xi''\rangle
 = \int\langle\xi'_t|H_t|\xi'''_t\rangle\,d\xi'''_t\,\langle\xi'''_t|\xi''\rangle, \tag{49}$$

showing that the transformation function $\langle\xi'_t|\xi''\rangle$ satisfies
Schrödinger's wave equation. Now $\xi_{t_0} = \xi$, so we must have

$$\langle\xi'_{t_0}|\xi''\rangle = \delta(\xi'_{t_0} - \xi''), \tag{50}$$

the $\delta$ function here being understood as the product of a number of
factors, one for each $\xi$-variable, such as occurs for the variables
$\xi_{v+1}, \ldots, \xi_u$ on the right-hand side of equation (34) of § 16. Thus the
transformation function $\langle\xi'_t|\xi''\rangle$ is that solution of Schrödinger's wave
equation for which the $\xi$'s certainly have the values $\xi''$ at time $t_0$.
The square of its modulus, $|\langle\xi'_t|\xi''\rangle|^2$, is the relative probability of the
$\xi$'s having the values $\xi'_t$ at time $t > t_0$ if they certainly have the values
$\xi''$ at time $t_0$. We may write $\langle\xi'_t|\xi''\rangle$ as $\langle\xi'_t|\xi''_{t_0}\rangle$ and consider it as
depending on $t_0$ as well as on $t$. To get its dependence on $t_0$ we take
the conjugate complex of equation (49), interchange $t$ and $t_0$ and also
interchange single primes and double primes. This gives

$$-i\hbar\frac{d}{dt_0}\langle\xi'_t|\xi''_{t_0}\rangle
 = \int\langle\xi'_t|\xi'''_{t_0}\rangle\,d\xi'''_{t_0}\,\langle\xi'''_{t_0}|H_{t_0}|\xi''_{t_0}\rangle. \tag{51}$$
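For a system with discrete eigenvalues the transformation function is an ordinary matrix, and its properties can be checked directly. The sketch below assumes a two-state system with a time-independent Hamiltonian chosen purely for illustration and units with hbar = 1; the Dirac delta of (50) becomes a Kronecker delta, one of the "necessary formal changes" for the discrete case.

```python
import numpy as np

hbar = 1.0
H = np.array([[1.0, 0.4],
              [0.4, -1.0]])           # illustrative Hermitian Hamiltonian
energies, vecs = np.linalg.eigh(H)

def transformation_function(t, t0=0.0):
    """Matrix with elements <xi'|T|xi''>, T = exp(-iH(t - t0)/hbar), as in (46)."""
    phases = np.exp(-1j * energies * (t - t0) / hbar)
    return (vecs * phases) @ vecs.conj().T

K0 = transformation_function(0.0)
K = transformation_function(0.8)

# Discrete counterpart of (50): at t = t0 the function is the Kronecker delta.
starts_as_delta = np.allclose(K0, np.eye(2))

# |<xi'_t|xi''>|^2 summed over xi' gives 1 for each initial value xi''.
column_sums = np.sum(np.abs(K) ** 2, axis=0)

# Finite-difference check of the wave equation i hbar dK/dt = H K, the discrete form of (49).
dt = 1e-6
dK_dt = (transformation_function(0.8 + dt) - transformation_function(0.8 - dt)) / (2 * dt)
residual = np.max(np.abs(1j * hbar * dK_dt - H @ K))
print(starts_as_delta, column_sums, residual)
```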

The foregoing discussion of the transformation function $\langle\xi'_t|\xi''\rangle$ is
valid with the $\xi$'s any complete set of commuting observables. The
equations were written down for the case of the $\xi$'s having continuous
eigenvalues, but they would still be valid if any of the $\xi$'s have
discrete eigenvalues, provided the necessary formal changes are made
in them. Let us now take a dynamical system having a classical
analogue and let us take the $\xi$'s to be the coordinates $q$. Put

$$\langle q'_t|q''\rangle = e^{iS/\hbar} \tag{52}$$

and so define the function $S$ of the variables $q'_t$, $q''$. This function also
depends explicitly on $t$. (52) is a solution of Schrödinger's wave
equation and, if $\hbar$ can be counted as small, it can be handled in the
same way as (35) was. The $S$ of (52) differs from the $S$ of (35) on
account of there being no $A$ in (52), which makes the $S$ of (52)
complex, but the real part of this $S$ equals the $S$ of (35) and its pure
imaginary part is of the order $\hbar$. Thus, in the limit $\hbar \to 0$, the $S$ of
(52) will equal that of (35) and will therefore satisfy, corresponding
to (38),

$$-\partial S/\partial t = H_c(q'_{rt}, p'_{rt}), \tag{53}$$

where

$$p'_{rt} = \partial S/\partial q'_{rt}, \tag{54}$$

and $H_c$ is the Hamiltonian of the classical analogue of our quantum
dynamical system. But (52) is also a solution of (51) with $q$'s for $\xi$'s,
which is the conjugate complex of Schrödinger's wave equation in the
variables $q''$ or $q''_{t_0}$. This causes $S$ to satisfy also†

$$\partial S/\partial t_0 = H_c(q''_r, p''_r), \tag{55}$$

where

$$p''_r = -\partial S/\partial q''_r. \tag{56}$$

The solution of the Hamilton-Jacobi equations (53), (55) is the
action function of classical mechanics for the time interval $t_0$ to $t$,
i.e. it is the time integral of the Lagrangian $L$,

$$S = \int_{t_0}^{t} L(t')\,dt'. \tag{57}$$

Thus the $S$ defined by (52) is the quantum analogue of the classical action
function and equals it in the limit $\hbar \to 0$. To get the quantum analogue
of the classical Lagrangian, we pass to the case of an infinitesimal
time interval by putting $t = t_0+\delta t$ and we then have $\langle q'_{t_0+\delta t}|q''_{t_0}\rangle$ as the
analogue of $e^{iL(t_0)\delta t/\hbar}$. For the sake of the analogy, one should consider
$L(t_0)$ as a function of the coordinates $q'$ at time $t_0+\delta t$ and the
coordinates $q''$ at time $t_0$, rather than as a function of the coordinates
and velocities at time $t_0$, as one usually does.
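A minimal worked example, assuming a free non-relativistic particle in one dimension with $H_c = p^2/2m$ (a choice made here for illustration, not taken from the text): the Lagrangian is $\tfrac12 m\dot q^2$ with constant velocity $(q'_t - q'')/(t - t_0)$, so the action (57) is the quadratic function below, and one can check term by term that it satisfies (53) to (56).

```latex
\begin{align*}
S &= \int_{t_0}^{t} \tfrac12 m \dot q^{2}\,dt'
   = \frac{m\,(q'_t - q'')^{2}}{2\,(t - t_0)}, \\
p'_t &= \frac{\partial S}{\partial q'_t} = \frac{m\,(q'_t - q'')}{t - t_0}, \qquad
p'' = -\frac{\partial S}{\partial q''} = \frac{m\,(q'_t - q'')}{t - t_0}, \\
-\frac{\partial S}{\partial t} &= \frac{m\,(q'_t - q'')^{2}}{2\,(t - t_0)^{2}}
   = \frac{p'^{\,2}_t}{2m} = H_c(p'_t), \qquad
\frac{\partial S}{\partial t_0} = \frac{m\,(q'_t - q'')^{2}}{2\,(t - t_0)^{2}}
   = \frac{p''^{\,2}}{2m} = H_c(p'').
\end{align*}
```

The two momenta come out equal, as they must for a particle on which no force acts.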

The principle of least action in classical mechanics says that the
action function (57) remains stationary for small variations of the
trajectory of the system which do not alter the end points, i.e. for small
variations of the $q$'s at all intermediate times between $t_0$ and $t$ with $q_{t_0}$
and $q_t$ fixed. Let us see what it corresponds to in the quantum theory.

Put

$$\exp\Big\{i\int_{t_a}^{t_b} L(t)\,dt/\hbar\Big\} = \exp\{iS(t_b, t_a)/\hbar\} = B(t_b, t_a), \tag{58}$$

so that $B(t_b, t_a)$ corresponds to $\langle q'_{t_b}|q'_{t_a}\rangle$ in the quantum theory. (We
here allow $q'_{t_a}$ and $q'_{t_b}$ to denote different eigenvalues of $q_{t_a}$ and $q_{t_b}$, to
save having to introduce a large number of primes into the analysis.)
Now suppose the time interval $t_0 \to t$ to be divided up into a large
number of small time intervals $t_0 \to t_1$, $t_1 \to t_2$, ..., $t_{m-1} \to t_m$, $t_m \to t$, by
the introduction of a sequence of intermediate times $t_1, t_2, \ldots, t_m$. Then

$$B(t, t_0) = B(t, t_m)B(t_m, t_{m-1})\ldots B(t_2, t_1)B(t_1, t_0). \tag{59}$$

The corresponding quantum equation, which follows from the
property of basic vectors (35) of § 16, is

$$\langle q'_t|q'_0\rangle = \int\!\!\int\!\ldots\!\int \langle q'_t|q'_m\rangle\,dq'_m\,\langle q'_m|q'_{m-1}\rangle\,dq'_{m-1}\ldots
 \langle q'_2|q'_1\rangle\,dq'_1\,\langle q'_1|q'_0\rangle, \tag{60}$$

$q'_k$ being written for $q'_{t_k}$ for brevity. At first sight there does not seem
to be any close correspondence between (59) and (60). We must,
however, analyse the meaning of (59) rather more carefully. We must
regard each factor $B$ as a function of the $q$'s at the two ends of the
time interval to which it refers. This makes the right-hand side of
(59) a function, not only of $q_t$ and $q_{t_0}$, but also of all the intermediate
$q$'s. Equation (59) is valid only when we substitute for the
intermediate $q$'s in its right-hand side their values for the real trajectory,
small variations in which values leave $S$ stationary and therefore also,
from (58), leave $B(t, t_0)$ stationary. It is the process of substituting
these values for the intermediate $q$'s which corresponds to the
integrations over all values for the intermediate $q'$'s in (60). The quantum
analogue of the action principle is thus absorbed in the composition
law (60) and the classical requirement that the values of the
intermediate $q$'s shall make $S$ stationary corresponds to the condition
in quantum mechanics that all values of the intermediate $q'$'s
are important in proportion to their contribution to the integral
in (60).

† For a more accurate comparison of transformation functions with classical
theory, see Van Vleck, Proc. Nat. Acad. 14, 178.
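The classical side of this correspondence, equation (59), can be made concrete with a single intermediate time. The sketch below assumes a free non-relativistic particle (an illustrative choice, not the general case of the text), for which the action over an interval is $m(q_b - q_a)^2/2(t_b - t_a)$. It treats the sum of the two partial actions as a function of the intermediate coordinate and locates its stationary value on a grid; for the free particle this stationary value happens to be a minimum, so `argmin` is enough.

```python
import numpy as np

m = 1.0

def action(qb, tb, qa, ta):
    """Classical action of a free particle for the straight path from (qa, ta) to (qb, tb)."""
    return m * (qb - qa) ** 2 / (2 * (tb - ta))

t0, t1, t = 0.0, 1.0, 3.0          # one intermediate time t1 (illustrative values)
q0, qt = 0.0, 6.0                  # fixed end points

# Sum of the actions of the two sub-intervals as a function of the intermediate q1.
# Since B = exp(iS/hbar), adding actions corresponds to multiplying the factors B in (59).
q1_grid = np.linspace(-5.0, 10.0, 150_001)
total = action(qt, t, q1_grid, t1) + action(q1_grid, t1, q0, t0)

q1_stationary = q1_grid[np.argmin(total)]
q1_classical = q0 + (qt - q0) * (t1 - t0) / (t - t0)   # point on the real trajectory
print(q1_stationary, q1_classical, total.min(), action(qt, t, q0, t0))
```

Only at the stationary value of the intermediate coordinate does the sum of the partial actions equal the action for the whole interval, which is the sense in which (59) holds for the real trajectory alone.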

Let us see how (59) can be a limiting case of (60) for $\hbar$ small. We
must suppose the integrand in (60) to be of the form $e^{iF/\hbar}$, where $F$ is
a function of $q'_0, q'_1, q'_2, \ldots, q'_m, q'_t$ which remains continuous as $\hbar$ tends
to zero, so that the integrand is a rapidly oscillating function when
$\hbar$ is small. The integral of such a rapidly oscillating function will be
extremely small, except for the contribution arising from a region in
the domain of integration where comparatively large variations in
the $q'_k$ produce only very small variations in $F$. Such a region must
be the neighbourhood of a point where $F$ is stationary for small
variations of the $q'_k$. Thus the integral in (60) is determined essentially by
the value of the integrand at a point where the integrand is stationary
for small variations of the intermediate $q'$'s, and so (60) goes over
into (59).
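The claim that a rapidly oscillating integral is governed by the neighbourhood of the stationary point can be tested on a one-dimensional toy integral. Everything in the sketch below is an assumption made for illustration: the phase $F(x) = (x-1)^2/2$ with its stationary point at $x = 1$, the Gaussian envelope that keeps the integral convergent, and the two sample values of hbar. The estimate uses the standard Fresnel result that the oscillating factor alone integrates to $\sqrt{2\pi i\hbar}$ when $F'' = 1$.

```python
import numpy as np

def oscillatory_integral(hbar, n=400_001):
    """Riemann sum of exp(-x^2/2) * exp(iF(x)/hbar) with F(x) = (x - 1)^2 / 2."""
    x = np.linspace(-8.0, 8.0, n)
    dx = x[1] - x[0]
    F = 0.5 * (x - 1.0) ** 2
    return np.sum(np.exp(-0.5 * x**2) * np.exp(1j * F / hbar)) * dx

def stationary_point_estimate(hbar):
    """Envelope at the stationary point x = 1 times the Fresnel factor sqrt(2 pi i hbar)."""
    return np.exp(-0.5) * np.sqrt(2j * np.pi * hbar)

errors = []
for hbar in (0.1, 0.01):
    full = oscillatory_integral(hbar)
    approx = stationary_point_estimate(hbar)
    errors.append(abs(full - approx) / abs(approx))
print(errors)
```

The relative discrepancy shrinks as hbar is made smaller: the contributions from everywhere except the stationary point cancel more and more completely.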

Equations (54) and (56) express that the variables $q'_t$, $p'_t$ are
connected with the variables $q''$, $p''$ by a contact transformation and are
one of the standard forms of writing the equations of a contact
transformation. There is an analogous form for writing the equations of a
unitary transformation in quantum mechanics. We get from (52), with
the help of (45) of § 22,

$$\langle q'_t|p_{rt}|q''\rangle = -i\hbar\frac{\partial}{\partial q'_{rt}}\langle q'_t|q''\rangle
 = \frac{\partial S(q'_t, q'')}{\partial q'_{rt}}\langle q'_t|q''\rangle. \tag{61}$$

Similarly, with the help of (46) of § 22,

$$\langle q'_t|p_r|q''\rangle = i\hbar\frac{\partial}{\partial q''_r}\langle q'_t|q''\rangle
 = -\frac{\partial S(q'_t, q'')}{\partial q''_r}\langle q'_t|q''\rangle. \tag{62}$$

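From the general definition of functions of commuting observables,
we have

$$\langle q'_t|f(q_t)g(q)|q''\rangle = f(q'_t)g(q'')\langle q'_t|q''\rangle, \tag{63}$$

where $f(q_t)$ and $g(q)$ are functions of the $q_t$'s and $q$'s respectively. Let
$G(q_t, q)$ be any function of the $q_t$'s and $q$'s consisting of a sum or
integral of terms each of the form $f(q_t)g(q)$, so that all the $q_t$'s in $G$
occur to the left of all the $q$'s. Such a function we call well ordered.
Applying (63) to each of the terms in $G$ and adding or integrating,
we get

$$\langle q'_t|G(q_t, q)|q''\rangle = G(q'_t, q'')\langle q'_t|q''\rangle.$$

Now let us suppose each $p_{rt}$ and $p_r$ can be expressed as a well-ordered
function of the $q_t$'s and $q$'s and write these functions $p_{rt}(q_t, q)$, $p_r(q_t, q)$.
Putting these functions for $G$, we get

$$\langle q'_t|p_{rt}|q''\rangle = p_{rt}(q'_t, q'')\langle q'_t|q''\rangle,$$

$$\langle q'_t|p_r|q''\rangle = p_r(q'_t, q'')\langle q'_t|q''\rangle.$$

Comparing these equations with (61) and (62) respectively, we see
that

$$p_{rt}(q'_t, q'') = \frac{\partial S(q'_t, q'')}{\partial q'_{rt}}, \qquad
p_r(q'_t, q'') = -\frac{\partial S(q'_t, q'')}{\partial q''_r}.$$

This means that

$$p_{rt} = \frac{\partial S(q_t, q)}{\partial q_{rt}}, \qquad
p_r = -\frac{\partial S(q_t, q)}{\partial q_r}, \tag{64}$$

provided the right-hand sides of (64) are written as well-ordered
functions.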

These equations are of the same form as (54) and (56), but refer to
the non-commuting quantum variables $q_t$, $q$ instead of the ordinary
algebraic variables $q'_t$, $q''$. They show how the conditions for a unitary
transformation between quantum variables are analogous to the
conditions for a contact transformation between classical variables. The
analogy is not complete, however, because the classical $S$ must be real
and there is no simple condition corresponding to this for the $S$ of (64).


33.  The  Gibbs  ensemble 

In  our  work  up  to  the  present  we  have  been  assuming  all  along  that 
our  dynamical  system  at  each  instant  of  time  is  in  a definite  state, 
that  is  to  say,  its  motion  is  specified  as  completely  and  accurately  as 
is possible without conflicting with the general principles of the theory.
In the classical theory this would mean, of course, that all the coordinates
and momenta have specified values. Now we may be interested
in a motion which is specified to a lesser extent than this maximum
possible.  The  present  section  will  be  devoted  to  the  methods  to  be 
used  in  such  a case. 

The  procedure  in  classical  mechanics  is  to  introduce  what  is  called 
a Gibbs ensemble, the idea of which is as follows. We consider all the
dynamical coordinates and momenta as Cartesian coordinates in a
certain space, the phase space, whose number of dimensions is twice
the number of degrees of freedom of the system. Any state of the
system can then be represented by a point in this space. This point
will move according to the classical equations of motion (14). Suppose,
now, that we are not given that the system is in a definite state
at  any  time,  but  only  that  it  is  in  one  or  other  of  a number  of  possible 
states  according  to  a definite  probability  law.  We  should  then  be 
able  to  represent  it  by  a fluid  in  the  phase  space,  the  mass  of  fluid  in 
any  volume  of  the  phase  space  being  the  total  probability  of  the 
system  being  in  any  state  whose  representative  point  lies  in  that 
volume.  Each  particle  of  the  fluid  will  be  moving  according  to  the 
equations of motion (14). If we introduce the density $\rho$ of the fluid
at any point, equal to the probability per unit volume of phase space
of the system being in the neighbourhood of the corresponding state,
we shall have the equation of conservation

$$\frac{\partial\rho}{\partial t} = -[\rho, H]. \tag{65}$$

This may be considered as the equation of motion for the fluid, since
it determines the density $\rho$ for all time if $\rho$ is given initially as a
function of the $q$'s and $p$'s. It is, apart from the minus sign, of the
same form as the ordinary equation of motion (15) for a dynamical
variable.
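
The passage from conservation of the fluid to the Poisson-bracket form (65) can be written out step by step. The following is only a sketch: it assumes that equations (14) are Hamilton's equations $\dot q_r = \partial H/\partial p_r$, $\dot p_r = -\partial H/\partial q_r$, and that the classical P.B. is $[u,v] = \sum_r(\partial u/\partial q_r\,\partial v/\partial p_r - \partial u/\partial p_r\,\partial v/\partial q_r)$. Both are taken from the usual classical theory and are not quoted in this section.

$$\begin{aligned}
\frac{\partial\rho}{\partial t}
 &= -\sum_r\left\{\frac{\partial}{\partial q_r}\big(\rho\,\dot q_r\big) + \frac{\partial}{\partial p_r}\big(\rho\,\dot p_r\big)\right\} \\
 &= -\sum_r\left\{\frac{\partial}{\partial q_r}\Big(\rho\,\frac{\partial H}{\partial p_r}\Big) - \frac{\partial}{\partial p_r}\Big(\rho\,\frac{\partial H}{\partial q_r}\Big)\right\} \\
 &= -\sum_r\left\{\frac{\partial\rho}{\partial q_r}\frac{\partial H}{\partial p_r} - \frac{\partial\rho}{\partial p_r}\frac{\partial H}{\partial q_r}\right\} = -[\rho, H].
\end{aligned}$$

The terms containing the second derivatives of $H$ cancel in pairs in passing from the second line to the third.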

The requirement that the total probability of the system being in
any state shall be unity gives us a normalizing condition for $\rho$

$$\iint \rho\,dq\,dp = 1, \tag{66}$$

the integration being over the whole of phase space and the single
differential $dq$ or $dp$ being written to denote the product of all the
$dq$'s or $dp$'s. If $\beta$ denotes any function of the dynamical variables,
the average value of $\beta$ will be

$$\iint \beta\rho\,dq\,dp. \tag{67}$$

It makes only a trivial alteration in the theory, but often facilitates
discussion, if we work with a density $\rho$ differing from the above one
by a positive constant factor, $k$ say, so that we have instead of (66)

$$\iint \rho\,dq\,dp = k.$$


With  this  density  we  can  picture  the  fluid  as  representing  a number 
k of  similar  dynamical  systems,  all  following  through  their  motions 
independently  in  the  same  place,  without  any  mutual  disturbance  or 
interaction.  The  density  at  any  point  would  then  be  the  probable  or 
average  number  of  systems  in  the  neighbourhood  of  any  state  per  unit 
volume of phase space, and expression (67) would give the average
total value of $\beta$ for all the systems. Such a set of dynamical systems,
which  is  the  ensemble  introduced  by  Gibbs,  is  usually  not  realizable 
in  practice,  except  as  a rough  approximation,  but  it  forms  all  the 
same  a useful  theoretical  abstraction. 

We shall now see that there exists a corresponding density $\rho$
in  quantum  mechanics,  having  properties  analogous  to  the  above. 
It  was  first  introduced  by  von  Neumann.  Its  existence  is  rather 
surprising  in  view  of  the  fact  that  phase  space  has  no  meaning  in 
quantum  mechanics,  there  being  no  possibility  of  assigning  numerical 
values  simultaneously  to  the  q’s  and  p’s. 

We  consider  a dynamical  system  which  is  at  a certain  time  in  one 
or  other  of  a number  of  possible  states  according  to  some  given 
probability law. These states may be either a discrete set or a continuous
range, or both together. We shall here take for definiteness
the case of a discrete set and suppose them labelled by a parameter $m$.
Let the normalized ket vectors corresponding to them be $|m\rangle$ and let
the probability of the system being in the $m$th state be $P_m$. We then
define the quantum density $\rho$ by

$$\rho = \sum_m |m\rangle P_m \langle m|. \tag{68}$$

Let $\rho'$ be any eigenvalue of $\rho$ and $|\rho'\rangle$ an eigenket belonging to this
eigenvalue. Then

$$\sum_m |m\rangle P_m \langle m|\rho'\rangle = \rho|\rho'\rangle = \rho'|\rho'\rangle,$$

so that

$$\sum_m \langle\rho'|m\rangle P_m \langle m|\rho'\rangle = \rho'\langle\rho'|\rho'\rangle,$$
$$\sum_m P_m\,|\langle m|\rho'\rangle|^2 = \rho'\langle\rho'|\rho'\rangle.$$

Now $P_m$, being a probability, can never be negative. It follows that
$\rho'$ cannot be negative. Thus $\rho$ has no negative eigenvalues, in analogy
with the fact that the classical density $\rho$ is never negative.
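
The definition (68) and the absence of negative eigenvalues can be checked numerically on a small case. The sketch below is an illustration only: the two-dimensional ket space, the two kets (normalized but deliberately not orthogonal) and the probabilities 0.25 and 0.75 are all assumptions made for the example, and the helper names are hypothetical.

```python
import math

def density_matrix(kets, probs):
    """Return rho = sum_m |m> P_m <m| as a nested list of complex numbers."""
    dim = len(kets[0])
    rho = [[0j] * dim for _ in range(dim)]
    for ket, p in zip(kets, probs):
        for r in range(dim):
            for c in range(dim):
                # matrix element <r|m> P_m <m|c>
                rho[r][c] += ket[r] * p * ket[c].conjugate()
    return rho

def eigenvalues_2x2_hermitian(mat):
    """Closed-form eigenvalues of a 2x2 Hermitian matrix, smallest first."""
    a, d = mat[0][0].real, mat[1][1].real
    b = mat[0][1]
    mean = (a + d) / 2.0
    radius = math.sqrt(((a - d) / 2.0) ** 2 + abs(b) ** 2)
    return mean - radius, mean + radius

s = 1.0 / math.sqrt(2.0)
kets = [[1 + 0j, 0j], [s + 0j, s + 0j]]   # normalized, not orthogonal (assumed)
probs = [0.25, 0.75]                      # assumed probability law P_m

rho = density_matrix(kets, probs)
low, high = eigenvalues_2x2_hermitian(rho)
trace = (rho[0][0] + rho[1][1]).real
print(low, high, trace)
```

Both eigenvalues come out non-negative and add up to unity, although the kets are not orthogonal and the eigenvalues therefore differ from the $P_m$ themselves.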

Let us now obtain the equation of motion for our quantum $\rho$. In
Schrödinger's picture the kets and bras in (68) will vary with the time
in accordance with Schrödinger's equation (5) and the conjugate
imaginary of this equation, while the $P_m$'s will remain constant, since
the system, so long as it is left undisturbed, cannot change over from
a state corresponding to one ket satisfying Schrödinger's equation to
a state corresponding to another. We thus have

$$i\hbar\frac{d\rho}{dt} = \sum_m i\hbar\left\{\frac{d|m\rangle}{dt}P_m\langle m| + |m\rangle P_m\frac{d\langle m|}{dt}\right\}
  = \sum_m \{H|m\rangle P_m\langle m| - |m\rangle P_m\langle m|H\}
  = H\rho - \rho H. \tag{69}$$

This is the quantum analogue of the classical equation of motion
(65). Our quantum $\rho$, like the classical one, is determined for all time
if it is given initially.
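
For a Hamiltonian that does not involve the time explicitly, (69) can be integrated formally. The following is a sketch under that assumption, with $\rho_0$ denoting the density at $t = 0$ (a symbol introduced here for illustration):

$$\rho_t = e^{-iHt/\hbar}\,\rho_0\,e^{iHt/\hbar}, \qquad
  i\hbar\frac{d\rho_t}{dt} = H\rho_t - \rho_t H.$$

Each ket $|m\rangle$ in (68) is carried into $e^{-iHt/\hbar}|m\rangle$ while the $P_m$'s stay fixed, so the eigenvalues of $\rho$ do not change with the time.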

From the assumption of § 12, the average value of any observable
$\beta$ when the system is in the state $m$ is $\langle m|\beta|m\rangle$. Hence if the system
is distributed over the various states $m$ according to the probability
law $P_m$, the average value of $\beta$ will be $\sum_m P_m\langle m|\beta|m\rangle$. If we introduce
a representation with a discrete set of basic ket vectors $|\xi'\rangle$ say, this
equals

$$\sum_{m\xi'} P_m\langle m|\xi'\rangle\langle\xi'|\beta|m\rangle
  = \sum_{\xi' m}\langle\xi'|\beta|m\rangle P_m\langle m|\xi'\rangle
  = \sum_{\xi'}\langle\xi'|\beta\rho|\xi'\rangle
  = \sum_{\xi'}\langle\xi'|\rho\beta|\xi'\rangle, \tag{70}$$

the last step being easily verified with the law of matrix multiplication,
equation (44) of § 17. The expressions (70) are the analogue of
the expression (67) of the classical theory. Whereas in the classical
theory we have to multiply $\beta$ by $\rho$ and take the integral of the
product over all phase space, in the quantum theory we have to
multiply $\beta$ by $\rho$, with the factors in either order, and take the
diagonal sum of the product in a representation. If the representation
involves a continuous range of basic vectors $|\xi'\rangle$, we get instead
of (70)

$$\int\langle\xi'|\beta\rho|\xi'\rangle\,d\xi' = \int\langle\xi'|\rho\beta|\xi'\rangle\,d\xi', \tag{71}$$

so that we must carry through a process of ‘integrating along the
diagonal’ instead of summing the diagonal elements. We shall define
(71) to be the diagonal sum of $\beta\rho$ in the continuous case. It can easily
be verified, from the properties of transformation functions (56) of
§ 18, that the diagonal sum is the same for all representations.
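
The equality of the average $\sum_m P_m\langle m|\beta|m\rangle$ with the diagonal sum of $\beta\rho$ or of $\rho\beta$ can be illustrated with small matrices. This is a minimal sketch: the two kets, the probabilities and the observable `beta` are assumptions chosen for the example, and all function names are hypothetical.

```python
import math

def density_matrix(kets, probs):
    """rho = sum_m |m> P_m <m| as a nested list."""
    dim = len(kets[0])
    rho = [[0j] * dim for _ in range(dim)]
    for ket, p in zip(kets, probs):
        for r in range(dim):
            for c in range(dim):
                rho[r][c] += ket[r] * p * ket[c].conjugate()
    return rho

def matmul(a, b):
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def diagonal_sum(mat):
    return sum(mat[i][i] for i in range(len(mat)))

def expectation(ket, op):
    """<m|op|m> for a normalized ket."""
    n = len(ket)
    return sum(ket[r].conjugate() * op[r][c] * ket[c]
               for r in range(n) for c in range(n))

s = 1.0 / math.sqrt(2.0)
kets = [[1 + 0j, 0j], [s + 0j, s + 0j]]   # assumed states
probs = [0.25, 0.75]                      # assumed probability law
beta = [[0j, 1 + 0j], [1 + 0j, 0j]]       # assumed real (Hermitian) observable

rho = density_matrix(kets, probs)
direct = sum(p * expectation(k, beta) for k, p in zip(kets, probs))
sum_beta_rho = diagonal_sum(matmul(beta, rho))
sum_rho_beta = diagonal_sum(matmul(rho, beta))
print(direct, sum_beta_rho, sum_rho_beta)
```

The three numbers agree, which is the content of (70) for this small case: the order of the factors $\beta$ and $\rho$ does not affect the diagonal sum.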

From the condition that the $|m\rangle$'s are normalized we get, with
discrete $\xi'$'s,

$$\sum_{\xi'}\langle\xi'|\rho|\xi'\rangle = \sum_{\xi' m}\langle\xi'|m\rangle P_m\langle m|\xi'\rangle = \sum_m P_m = 1, \tag{72}$$

since the total probability of the system being in any state is unity.
This is the analogue of equation (66). The probability of the system
being in the state $\xi'$, or the probability of the observables $\xi$ which
are diagonal in the representation having the values $\xi'$, is, according
to the rule for interpreting representatives of kets (51) of § 18,

$$\sum_m |\langle\xi'|m\rangle|^2 P_m = \langle\xi'|\rho|\xi'\rangle, \tag{73}$$

which gives us a meaning for each term in the sum on the left-hand
side of (72). For continuous $\xi'$'s, the right-hand side of (73) gives the
probability of the $\xi$'s having values in the neighbourhood of $\xi'$ per
unit range of variation of the values $\xi'$.

As in the classical theory, we may take a density equal to $k$ times
the above $\rho$ and consider it as representing a Gibbs ensemble of $k$
similar dynamical systems, between which there is no mutual disturbance
or interaction. We shall then have $k$ on the right-hand side
of (72), and (70) or (71) will give the total average $\beta$ for all the
members of the ensemble, while (73) will give the total probability
of a member of the ensemble having values for its $\xi$'s equal to $\xi'$
or in the neighbourhood of $\xi'$ per unit range of variation of the
values $\xi'$.

An  important  application  of  the  Gibbs  ensemble  is  to  a dynamical 
system  in  thermodynamic  equilibrium  with  its  surroundings  at  a 
given temperature $T$. Gibbs showed that such a system is represented
in classical mechanics by the density

$$\rho = c\,e^{-H/kT}, \tag{74}$$

$H$ being the Hamiltonian, which is now independent of the time, $k$
being Boltzmann's constant, and $c$ being a number chosen to make
the normalizing condition (66) hold. This formula may be taken over
unchanged into the quantum theory. At high temperatures, (74)
becomes $\rho = c$, which gives, on being substituted into the right-hand
side of (73), $c\langle\xi'|\xi'\rangle = c$ in the case of discrete $\xi'$'s. This shows that
at high temperatures all discrete states are equally probable.
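
The high-temperature statement can be illustrated numerically in the representation in which $H$ is diagonal, where (74) gives each state the probability $c\,e^{-E/kT}$. The sketch below assumes a hypothetical system with just five levels, with energies $(n + \tfrac12)$ in units of $\hbar\omega$. The finite number of levels is an assumption of the example (needed so that $c$ stays finite as $T$ grows), and the function name is hypothetical.

```python
import math

def thermal_probabilities(energies, kT):
    """Diagonal elements of rho = c*exp(-H/kT) in the energy representation."""
    lowest = min(energies)
    # Subtracting the lowest energy only rescales c and avoids underflow.
    weights = [math.exp(-(e - lowest) / kT) for e in energies]
    c_inverse = sum(weights)
    return [w / c_inverse for w in weights]

# Hypothetical five-level system, energies in units of hbar*omega.
energies = [n + 0.5 for n in range(5)]

cold = thermal_probabilities(energies, kT=0.1)     # kT much less than hbar*omega
hot = thermal_probabilities(energies, kT=1.0e6)    # kT much greater than hbar*omega
print(cold)
print(hot)
```

At the low temperature almost all of the probability sits on the lowest level, while at the high temperature the five probabilities are practically equal.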


VI 

ELEMENTARY  APPLICATIONS 

34.  The  harmonic  oscillator 

A simple  and  interesting  example  of  a dynamical  system  in  quantum 
mechanics  is  the  harmonic  oscillator.  This  example  is  of  importance 
for  general  theory,  because  it  forms  a corner-stone  in  the  theory  of 
radiation.  The  dynamical  variables  needed  for  describing  the  system 
are  just  one  coordinate  q and  its  conjugate  momentum  p.  The 
Hamiltonian  in  classical  mechanics  is 

$$H = \frac{1}{2m}(p^2 + m^2\omega^2 q^2), \tag{1}$$

where $m$ is the mass of the oscillating particle and $\omega$ is $2\pi$ times the
frequency. We assume the same Hamiltonian in quantum mechanics.
This Hamiltonian, together with the quantum condition (10) of § 22,
define the system completely.

The Heisenberg equations of motion are

$$\dot q_t = [q_t, H] = p_t/m, \qquad \dot p_t = [p_t, H] = -m\omega^2 q_t. \tag{2}$$

It is convenient to introduce the dimensionless complex dynamical
variable

$$\eta = (2m\hbar\omega)^{-1/2}(p + im\omega q). \tag{3}$$

The equations of motion (2) give

$$\dot\eta_t = (2m\hbar\omega)^{-1/2}(-m\omega^2 q_t + i\omega p_t) = i\omega\eta_t.$$

This equation can be integrated to give

$$\eta_t = \eta_0 e^{i\omega t}, \tag{4}$$

where $\eta_0$ is a linear operator independent of $t$, and is equal to the
value of $\eta_t$ at time $t = 0$. The above equations are all as in the
classical theory.
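
Since these equations are the same as in the classical theory, the result (4) can be checked with ordinary numbers in place of operators. The sketch below uses the familiar classical solution of (2) for given initial values. The numerical values of `m`, `omega`, `q0`, `p0` and `t` are arbitrary assumptions, $\hbar$ is set to 1, and the function names are hypothetical.

```python
import cmath
import math

def eta(q, p, m, omega, hbar=1.0):
    """Complex variable (3): (2*m*hbar*omega)**(-1/2) * (p + i*m*omega*q)."""
    return (p + 1j * m * omega * q) / math.sqrt(2.0 * m * hbar * omega)

def classical_state(q0, p0, m, omega, t):
    """Classical solution of equations (2) with initial values q0, p0."""
    c, s = math.cos(omega * t), math.sin(omega * t)
    q_t = q0 * c + (p0 / (m * omega)) * s
    p_t = p0 * c - m * omega * q0 * s
    return q_t, p_t

m, omega, q0, p0, t = 2.0, 3.0, 0.7, -1.1, 0.45   # arbitrary illustrative values
q_t, p_t = classical_state(q0, p0, m, omega, t)

lhs = eta(q_t, p_t, m, omega)                          # eta at time t
rhs = eta(q0, p0, m, omega) * cmath.exp(1j * omega * t)  # eta_0 * exp(i*omega*t)
print(abs(lhs - rhs))
```

The difference is zero to rounding error, so the single complex variable $\eta$ carries the whole motion of $q$ and $p$ as a uniform rotation in the complex plane.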

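We can express $q$ and $p$ in terms of $\eta$ and its conjugate complex $\bar\eta$
and may thus work entirely in terms of $\eta$ and $\bar\eta$. We have

$$\hbar\omega\,\eta\bar\eta = (2m)^{-1}(p + im\omega q)(p - im\omega q)
  = (2m)^{-1}[p^2 + m^2\omega^2 q^2 + im\omega(qp - pq)]
  = H - \tfrac12\hbar\omega \tag{5}$$

and similarly

$$\hbar\omega\,\bar\eta\eta = H + \tfrac12\hbar\omega. \tag{6}$$

Thus

$$\bar\eta\eta - \eta\bar\eta = 1. \tag{7}$$

Equation (5) or (6) gives $H$ in terms of $\eta$ and $\bar\eta$ and (7) gives the
commutation relation connecting $\eta$ and $\bar\eta$. From (5)

$$\hbar\omega\,\bar\eta\eta\bar\eta = \bar\eta H - \tfrac12\hbar\omega\,\bar\eta$$

and from (6)

$$\hbar\omega\,\bar\eta\eta\bar\eta = H\bar\eta + \tfrac12\hbar\omega\,\bar\eta.$$

Thus

$$\bar\eta H - H\bar\eta = \hbar\omega\,\bar\eta. \tag{8}$$

Also, (7) leads to

$$\bar\eta\eta^n - \eta^n\bar\eta = n\eta^{n-1} \tag{9}$$

for any positive integer $n$, as may be verified by induction, since, by
multiplying (9) by $\eta$ on the left, we can deduce (9) with $n+1$ for $n$.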
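
The induction step can be written out explicitly. As a sketch, multiply (9) by $\eta$ on the left and use (7) in the form $\eta\bar\eta = \bar\eta\eta - 1$:

$$\eta\bar\eta\eta^n - \eta^{n+1}\bar\eta = n\eta^n
  \;\Longrightarrow\; (\bar\eta\eta - 1)\eta^n - \eta^{n+1}\bar\eta = n\eta^n
  \;\Longrightarrow\; \bar\eta\eta^{n+1} - \eta^{n+1}\bar\eta = (n+1)\eta^n.$$

For $n = 1$, (9) is just (7) itself, which starts the induction.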
Let $H'$ be an eigenvalue of $H$ and $|H'\rangle$ an eigenket belonging to it.
From (5)

$$\hbar\omega\,\langle H'|\eta\bar\eta|H'\rangle = \langle H'|H - \tfrac12\hbar\omega|H'\rangle = (H' - \tfrac12\hbar\omega)\langle H'|H'\rangle.$$

Now $\langle H'|\eta\bar\eta|H'\rangle$ is the square of the length of the ket $\bar\eta|H'\rangle$, and
hence

$$\langle H'|\eta\bar\eta|H'\rangle \geq 0,$$

the case of equality occurring only if $\bar\eta|H'\rangle = 0$. Also $\langle H'|H'\rangle > 0$.
Thus

$$H' \geq \tfrac12\hbar\omega, \tag{10}$$

the case of equality occurring only if $\bar\eta|H'\rangle = 0$. From the form (1)
of $H$ as a sum of squares, we should expect its eigenvalues to be all
positive or zero (since the average value of $H$ for any state must be
positive or zero). We now have the more stringent condition (10).
From (8)

$$H\bar\eta|H'\rangle = (\bar\eta H - \hbar\omega\,\bar\eta)|H'\rangle = (H' - \hbar\omega)\,\bar\eta|H'\rangle. \tag{11}$$

Now if $H' \neq \tfrac12\hbar\omega$, $\bar\eta|H'\rangle$ is not zero and is then according to (11) an
eigenket of $H$ belonging to the eigenvalue $H' - \hbar\omega$. Thus, with $H'$
any eigenvalue of $H$ not equal to $\tfrac12\hbar\omega$, $H' - \hbar\omega$ is another eigenvalue
of $H$. We can repeat the argument and infer that, if $H' - \hbar\omega \neq \tfrac12\hbar\omega$,
$H' - 2\hbar\omega$ is another eigenvalue of $H$. Continuing in this way, we
obtain the series of eigenvalues $H'$, $H' - \hbar\omega$, $H' - 2\hbar\omega$, $H' - 3\hbar\omega$, ...,
which cannot extend to infinity, because then it would contain eigenvalues
contradicting (10), and can terminate only with the value $\tfrac12\hbar\omega$.
Again, from the conjugate complex of equation (8)

$$H\eta|H'\rangle = (\eta H + \hbar\omega\,\eta)|H'\rangle = (H' + \hbar\omega)\,\eta|H'\rangle,$$

showing that $H' + \hbar\omega$ is another eigenvalue of $H$, with $\eta|H'\rangle$ as an
eigenket belonging to it, unless $\eta|H'\rangle = 0$. The latter alternative
can be ruled out, since it would lead to

$$0 = \hbar\omega\,\bar\eta\eta|H'\rangle = (H + \tfrac12\hbar\omega)|H'\rangle = (H' + \tfrac12\hbar\omega)|H'\rangle,$$

which contradicts (10). Thus $H' + \hbar\omega$ is always another eigenvalue
of $H$, and so are $H' + 2\hbar\omega$, $H' + 3\hbar\omega$ and so on. Hence the eigenvalues
of $H$ are the series of numbers

$$\tfrac12\hbar\omega,\quad \tfrac32\hbar\omega,\quad \tfrac52\hbar\omega,\quad \tfrac72\hbar\omega,\quad \ldots \tag{12}$$

extending to infinity. These are the possible energy values for the
harmonic oscillator.

Let $|0\rangle$ be an eigenket of $H$ belonging to the lowest eigenvalue
$\tfrac12\hbar\omega$, so that

$$\bar\eta|0\rangle = 0, \tag{13}$$

and form the sequence of kets

$$|0\rangle,\quad \eta|0\rangle,\quad \eta^2|0\rangle,\quad \eta^3|0\rangle,\quad \ldots. \tag{14}$$

These kets are all eigenkets of $H$, belonging to the sequence of eigenvalues
(12) respectively. From (9) and (13)

$$\bar\eta\eta^n|0\rangle = n\eta^{n-1}|0\rangle \tag{15}$$

for any non-negative integer $n$. Thus the set of kets (14) is such that
$\eta$ or $\bar\eta$ applied to any one of the set gives a ket dependent on the set.
Now all the dynamical variables in our problem are expressible in terms
of $\eta$ and $\bar\eta$, so the kets (14) must form a complete set (otherwise there
would be some more dynamical variables). There is just one of these
kets for each eigenvalue (12) of $H$, so $H$ by itself forms a complete
commuting set of observables. The kets (14) correspond to the various
stationary states of the oscillator. The stationary state with energy
$(n + \tfrac12)\hbar\omega$, corresponding to $\eta^n|0\rangle$, is called the $n$th quantum state.

The square of the length of the ket $\eta^n|0\rangle$ is

$$\langle 0|\bar\eta^n\eta^n|0\rangle = n\,\langle 0|\bar\eta^{n-1}\eta^{n-1}|0\rangle$$

with the help of (15). By induction, we find that

$$\langle 0|\bar\eta^n\eta^n|0\rangle = n! \tag{16}$$

provided $|0\rangle$ is normalized. Thus the kets (14) multiplied by the
coefficients $n!^{-1/2}$ with $n = 0, 1, 2, \ldots$, respectively form the basic kets
of a representation, namely the representation with $H$ diagonal. Any
ket $|x\rangle$ can be expanded in the form

$$|x\rangle = \sum_{n=0}^{\infty} x_n\eta^n|0\rangle, \tag{17}$$

where the $x_n$'s are numbers. In this way the ket $|x\rangle$ is put into
correspondence with a power series $\sum x_n\eta^n$ in the variable $\eta$, the
various terms in the power series corresponding to the various
stationary states. If $|x\rangle$ is normalized, it defines a state for which
the probability of the oscillator being in the $n$th quantum state,
i.e. the probability of $H$ having the value $(n + \tfrac12)\hbar\omega$, is

$$P_n = n!\,|x_n|^2, \tag{18}$$

as follows from the same argument which led to (51) of § 18.
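
The relations (5), (15) and (16) can be checked with finite matrices. In the basis of normalized kets $n!^{-1/2}\eta^n|0\rangle$, (15) implies that $\bar\eta$ takes the $n$th basic ket into $\sqrt n$ times the $(n-1)$th. The sketch below builds these matrices for the first few states only. The truncation to `SIZE` states is a numerical convenience assumed for the example and is not part of the theory, and the function names are hypothetical.

```python
import math

def lowering_matrix(size):
    """Matrix of eta-bar in the normalized basis eta^n|0>/sqrt(n!), n < size."""
    mat = [[0.0] * size for _ in range(size)]
    for n in range(1, size):
        mat[n - 1][n] = math.sqrt(n)   # eta-bar: state n -> sqrt(n) * state n-1
    return mat

def transpose(mat):
    return [list(row) for row in zip(*mat)]

def matmul(a, b):
    size = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(size)) for c in range(size)]
            for r in range(size)]

def apply(mat, vec):
    return [sum(row[k] * vec[k] for k in range(len(vec))) for row in mat]

SIZE = 8                      # truncation, assumed for the illustration
eta_bar = lowering_matrix(SIZE)
eta = transpose(eta_bar)      # real matrices, so the conjugate is the transpose

# From (5): H/(hbar*omega) = eta*eta_bar + 1/2, diagonal with values n + 1/2.
product = matmul(eta, eta_bar)
energies = [product[n][n] + 0.5 for n in range(SIZE)]

# Squared length of eta^4|0>, to compare with (16).
ket = [1.0] + [0.0] * (SIZE - 1)
for _ in range(4):
    ket = apply(eta, ket)
squared_length = sum(x * x for x in ket)
print(energies, squared_length)
```

The diagonal elements reproduce the series (12) in units of $\hbar\omega$, and the squared length of $\eta^4|0\rangle$ comes out as $4! = 24$. The commutation relation (7) holds for these truncated matrices except in the last diagonal element, which is an effect of cutting off the infinite series of states.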

We may consider the ket $|0\rangle$ as a standard ket and the power series
in $\eta$ as a wave function, since any ket can be expressed as such a
wave function multiplied into this standard ket. We get a kind of
wave function differing from the usual kind, introduced by equations
(62) of § 20, in that it is a function of the complex dynamical variable
$\eta$ instead of observables. It was first introduced by V. Fock, so we
shall call the representation Fock's representation. It is for many
purposes the most convenient representation for describing states of
the harmonic oscillator. The standard ket $|0\rangle$ satisfies the condition
(13), which replaces the conditions (43) of § 22 for the standard ket
in Schrödinger's representation.

Let us introduce Schrödinger's representation with $q$ diagonal and
obtain the representatives of the stationary states. From (13) and (3)

$$(p - im\omega q)|0\rangle = 0,$$

so

$$\langle q'|p - im\omega q|0\rangle = 0.$$

With the help of (45) of § 22, this gives

$$\hbar\frac{\partial}{\partial q'}\langle q'|0\rangle + m\omega q'\langle q'|0\rangle = 0. \tag{19}$$

The solution of this differential equation is

$$\langle q'|0\rangle = (m\omega/\pi\hbar)^{1/4}\,e^{-m\omega q'^2/2\hbar}, \tag{20}$$

the numerical coefficient being chosen so as to make $|0\rangle$ normalized.
We have here the representative of the normal state, as the state of
lowest energy is called. The representatives of the other stationary
states can be obtained from it. We have from (3)

$$\langle q'|\eta^n|0\rangle = (2m\hbar\omega)^{-n/2}\langle q'|(p + im\omega q)^n|0\rangle
  = (2m\hbar\omega)^{-n/2}\,i^n\left(-\hbar\frac{\partial}{\partial q'} + m\omega q'\right)^n\langle q'|0\rangle
  = i^n(2m\hbar\omega)^{-n/2}(m\omega/\pi\hbar)^{1/4}\left(-\hbar\frac{\partial}{\partial q'} + m\omega q'\right)^n e^{-m\omega q'^2/2\hbar}. \tag{21}$$

This may easily be worked out for small values of $n$. The result is of
the form of $e^{-m\omega q'^2/2\hbar}$ times a power series of degree $n$ in $q'$. A further
factor $n!^{-1/2}$ must be inserted in (21) to get the normalized representative
of the $n$th quantum state. The phase factor $i^n$ may be discarded.
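
The working out of (21) for small $n$ can be mechanized. Acting on a polynomial $P(q')$ times the exponential, the operator $-\hbar\,\partial/\partial q' + m\omega q'$ gives $(-\hbar P' + 2m\omega q' P)$ times the same exponential. The sketch below iterates this rule in units with $\hbar = m = \omega = 1$, which is an assumption made only to keep the coefficients integral. Polynomials are stored as coefficient lists starting from the constant term, and the function names are hypothetical.

```python
def apply_raising(poly):
    """Map P to 2*q*P - P', the polynomial factor left after applying
    (-d/dq + q) to P(q)*exp(-q*q/2), in units hbar = m = omega = 1."""
    result = [0] * (len(poly) + 1)
    for power, coeff in enumerate(poly):
        result[power + 1] += 2 * coeff           # contribution of 2*q*P
        if power > 0:
            result[power - 1] -= power * coeff   # contribution of -P'
    return result

def polynomial_factor(n):
    """Polynomial multiplying exp(-q*q/2) in (21), numerical factors omitted."""
    poly = [1]
    for _ in range(n):
        poly = apply_raising(poly)
    return poly

for n in range(4):
    print(n, polynomial_factor(n))
```

The lists printed are the coefficients of $1$, $2q'$, $4q'^2 - 2$ and $8q'^3 - 12q'$, each of degree $n$ as stated in the text. These are the polynomials usually called Hermite polynomials.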


35. Angular momentum

Let us consider a particle described by the three Cartesian coordinates
$x$, $y$, $z$ and their conjugate momenta $p_x$, $p_y$, $p_z$. Its angular
momentum about the origin is defined as in the classical theory, by

$$m_x = yp_z - zp_y, \qquad m_y = zp_x - xp_z, \qquad m_z = xp_y - yp_x, \tag{22}$$

or by the vector equation

$$\mathbf{m} = \mathbf{x}\times\mathbf{p}.$$

We must evaluate the P.B.s of the angular momentum components
with the dynamical variables $x$, $p_x$, etc., and with each other. This
we can do most conveniently with the help of the laws (4) and (5) of
§ 21, thus

$$[m_z, x] = [xp_y - yp_x, x] = -y[p_x, x] = y,$$
$$[m_z, y] = [xp_y - yp_x, y] = x[p_y, y] = -x, \tag{23}$$
$$[m_z, z] = [xp_y - yp_x, z] = 0, \tag{24}$$

and similarly,

$$[m_z, p_x] = p_y, \qquad [m_z, p_y] = -p_x, \tag{25}$$
$$[m_z, p_z] = 0, \tag{26}$$

with corresponding relations for $m_x$ and $m_y$. Again

$$[m_y, m_z] = [zp_x - xp_z, m_z] = z[p_x, m_z] - [x, m_z]p_z = -zp_y + yp_z = m_x,$$
$$[m_z, m_x] = m_y, \qquad [m_x, m_y] = m_z. \tag{27}$$

These results are all the same as in the classical theory. The sign in
the results (23), (25), and (27) may easily be remembered from the
rule that the $+$ sign occurs when the three dynamical variables, consisting
of the two in the P.B. on the left-hand side and the one
forming the result on the right, are in the cyclic order $(xyz)$ and the
$-$ sign occurs otherwise. Equations (27) may be put in the vector
form

$$\mathbf{m}\times\mathbf{m} = i\hbar\,\mathbf{m}. \tag{28}$$
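
The vector relation (28) can be checked on a small set of matrices. The sketch below does not use the orbital operators (22) themselves, which need an infinite number of rows and columns. It assumes instead the three 3×3 matrices with elements $-i\hbar\,\epsilon_{ijk}$, a standard finite set obeying the same relations, and takes $\hbar = 1$. The commutator $uv - vu$ computed in the code is $i\hbar$ times the P.B. $[u, v]$ of the text. All names are hypothetical.

```python
HBAR = 1.0  # units with hbar = 1, chosen for the illustration

def levi_civita(i, j, k):
    """Totally antisymmetric symbol for indices taken from 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2

def generator(i):
    """3x3 matrix with elements -i * hbar * epsilon(i, row, col)."""
    return [[-1j * HBAR * levi_civita(i, row, col) for col in range(3)]
            for row in range(3)]

def matmul(a, b):
    return [[sum(a[r][n] * b[n][c] for n in range(3)) for c in range(3)]
            for r in range(3)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[r][c] - ba[r][c] for c in range(3)] for r in range(3)]

def scaled(factor, mat):
    return [[factor * x for x in row] for row in mat]

def max_abs_difference(a, b):
    return max(abs(a[r][c] - b[r][c]) for r in range(3) for c in range(3))

m_x, m_y, m_z = generator(0), generator(1), generator(2)

# Components of m x m = i*hbar*m, written as commutators.
errors = [
    max_abs_difference(commutator(m_y, m_z), scaled(1j * HBAR, m_x)),
    max_abs_difference(commutator(m_z, m_x), scaled(1j * HBAR, m_y)),
    max_abs_difference(commutator(m_x, m_y), scaled(1j * HBAR, m_z)),
]
print(errors)
```

All three differences vanish, so these matrices satisfy (28) component by component.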

Now suppose we have several particles with angular momenta
$\mathbf{m}_1, \mathbf{m}_2, \ldots$. Each of these angular momentum vectors will satisfy
(28), thus

$$\mathbf{m}_r\times\mathbf{m}_r = i\hbar\,\mathbf{m}_r,$$

and any one of them will commute with any other, so that

$$\mathbf{m}_r\times\mathbf{m}_s + \mathbf{m}_s\times\mathbf{m}_r = 0 \qquad (r \neq s).$$

Hence if $\mathbf{M} = \sum_r \mathbf{m}_r$ is the total angular momentum,

$$\mathbf{M}\times\mathbf{M} = \sum_{rs}\mathbf{m}_r\times\mathbf{m}_s
  = \sum_r \mathbf{m}_r\times\mathbf{m}_r + \sum_{r<s}(\mathbf{m}_r\times\mathbf{m}_s + \mathbf{m}_s\times\mathbf{m}_r)
  = i\hbar\sum_r\mathbf{m}_r = i\hbar\,\mathbf{M}. \tag{29}$$

This result is of the same form as (28), so that the components of the
total angular momentum $\mathbf{M}$ of any number of particles satisfy the
same commutation relations as those of the angular momentum of
a single particle.

Let $A_x$, $A_y$, $A_z$ denote the three coordinates of any one of the
particles, or else the three components of momentum of one of
the particles. The $A$'s will commute with the angular momenta of
the other particles, and hence from (23), (24), (25), and (26)

$$[M_z, A_x] = A_y, \qquad [M_z, A_y] = -A_x, \qquad [M_z, A_z] = 0. \tag{30}$$

If $B_x$, $B_y$, $B_z$ are a second set of three quantities denoting the
coordinates or momentum components of one of the particles, they
will satisfy similar relations to (30). We shall then have

$$[M_z, A_xB_x + A_yB_y + A_zB_z]
  = [M_z, A_x]B_x + A_x[M_z, B_x] + [M_z, A_y]B_y + A_y[M_z, B_y]
  = A_yB_x + A_xB_y - A_xB_y - A_yB_x
  = 0.$$

Thus the scalar product $A_xB_x + A_yB_y + A_zB_z$ commutes with $M_z$,
and similarly with $M_x$ and $M_y$. Introduce the vector product

$$\mathbf{A}\times\mathbf{B} = \mathbf{C}$$

or

$$A_yB_z - A_zB_y = C_x, \qquad A_zB_x - A_xB_z = C_y, \qquad A_xB_y - A_yB_x = C_z.$$

We have

$$[M_z, C_x] = -A_xB_z + A_zB_x = C_y,$$

and similarly

$$[M_z, C_y] = -C_x, \qquad [M_z, C_z] = 0.$$

These equations are again of the form (30), with $\mathbf{C}$ for $\mathbf{A}$. We can
conclude from this work that equations of the form (30) hold for the
three components of any vector that we can construct from our
dynamical variables, and that any scalar commutes with $\mathbf{M}$.

We can introduce linear operators $R$ referring to rotations about
the origin in the same way in which we introduced the linear operators
$D$ in § 25 referring to displacements. Taking a rotation through an
angle $\delta\phi$ about the $z$-axis and making $\delta\phi$ infinitesimal, we can obtain
the limit operator corresponding to (64) of § 25,

$$\lim_{\delta\phi\to 0}\,(R - 1)/\delta\phi,$$

which we shall call the rotation operator about the $z$-axis and denote
by $r_z$. Like the displacement operators, $r_z$ is a pure imaginary linear
operator and is undetermined to the extent of an arbitrary additive
pure imaginary number. Corresponding to (66) of § 25, the change
in any dynamical variable $v$ caused by a rotation through a small
angle $\delta\phi$ about the $z$-axis is

$$\delta\phi\,(r_zv - vr_z), \tag{31}$$

to the first order in $\delta\phi$. Now the changes produced in the three
components $A_x$, $A_y$, $A_z$ of a vector by a (right-handed) rotation $\delta\phi$
about the $z$-axis applied to all measuring apparatus are $\delta\phi\,A_y$,
$-\delta\phi\,A_x$, and 0 respectively, and any scalar quantity is unchanged by
the rotation. Equating these changes to (31), we find that

$$r_zA_x - A_xr_z = A_y, \qquad r_zA_y - A_yr_z = -A_x, \qquad r_zA_z - A_zr_z = 0,$$

and $r_z$ commutes with any scalar. Comparing these results with (30),
we see that $i\hbar r_z$ satisfies the same commutation relations as $M_z$.
Their difference, $M_z - i\hbar r_z$, commutes with all the dynamical variables
and must therefore be a number. This number, which is necessarily
real since $M_z$ and $i\hbar r_z$ are real, may be made zero by a suitable choice
of the arbitrary pure imaginary number that can be added to $r_z$. We
then have the result

$$M_z = i\hbar r_z. \tag{32}$$

Similar equations hold for $M_x$ and $M_y$. They are the analogues of (69)
of § 25. Thus the total angular momentum is connected with the rotation
operators as the total momentum is connected with the displacement
operators. This conclusion is valid for any point as origin.

The above argument applies to the angular momentum arising
from the motion of particles, defined by (22) for each particle. There
is another kind of angular momentum occurring in atomic theory,
spin angular momentum. The former kind of angular momentum will
be called orbital angular momentum, to distinguish it. The spin angular
momentum of a particle should be pictured as due to some internal
motion of the particle, so that it is associated with different degrees
of freedom from those describing the motion of the particle as a whole,
and hence the dynamical variables that describe the spin must
commute with $x$, $y$, $z$, $p_x$, $p_y$, and $p_z$. The spin does not correspond very
closely to anything in classical mechanics, so the method of classical
analogy is not suitable for studying it. However, we can build up a
theory of the spin simply from the assumption that the components
of the spin angular momentum are connected with the rotation operators
in the same way as we had above for orbital angular momentum,
i.e. equation (32) holds with $M_z$ as the $z$ component of the spin angular
momentum of a particle and $r_z$ as the rotation operator about the
$z$-axis referring to states of spin of that particle. With this assumption,
the commutation relations connecting the components of the
spin angular momentum $\mathbf{M}$ with any vector $\mathbf{A}$ referring to the spin
must be of the standard form (30), and hence, taking $\mathbf{A}$ to be the
spin angular momentum itself, we have equation (29) holding also
for the spin. We now have (29) holding quite generally, for any sum
of spin and orbital angular momenta, and also (30) holding quite generally,
for $\mathbf{M}$ the total spin and orbital angular momentum and $\mathbf{A}$ any vector
dynamical variable, and the connexion between angular momentum
and rotation operators will be always valid.

As an immediate consequence of this connexion, we can deduce the
law of conservation of angular momentum. For an isolated system, the
Hamiltonian must be unchanged by any rotation about the origin, in
other words it must be a scalar, so it must commute with the angular
momentum about the origin. Thus the angular momentum is a
constant of the motion. For this argument the origin may be any
point.

As a second immediate consequence, we can deduce that a state
with zero total angular momentum is spherically symmetrical. The state
will correspond to a ket $|S\rangle$ say, satisfying

$$M_x|S\rangle = M_y|S\rangle = M_z|S\rangle = 0, \tag{33}$$

and hence

$$r_x|S\rangle = r_y|S\rangle = r_z|S\rangle = 0.$$

This shows that the ket $|S\rangle$ is unaltered by infinitesimal rotations,
and it must therefore be unaltered by finite rotations, since the latter
can be built up from infinitesimal ones. Thus the state is spherically
symmetrical. The result may be understood in this way: if a state has
zero total angular momentum, the dynamical system is equally likely
to have any orientation, and hence spherical symmetry occurs. This is
analogous to stating that if a state has zero total linear momentum,
the system is equally likely to be anywhere in space.
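
The remark that finite rotations can be built up from infinitesimal ones can be illustrated with the first-order changes quoted earlier for the components of a vector, namely $\delta\phi\,A_y$ for $A_x$ and $-\delta\phi\,A_x$ for $A_y$. The sketch below composes many such small steps and compares the result with the finite rotation through the same total angle. The starting components, the angle and the number of steps are arbitrary assumptions, and the function names are hypothetical.

```python
import math

def rotate_by_small_steps(a_x, a_y, phi, steps):
    """Compose `steps` first-order rotations of angle phi/steps about the z-axis.

    Each step adds delta*A_y to A_x and -delta*A_x to A_y, the first-order
    changes of the components of a vector under a small rotation.
    """
    delta = phi / steps
    for _ in range(steps):
        a_x, a_y = a_x + delta * a_y, a_y - delta * a_x
    return a_x, a_y

def rotate_exactly(a_x, a_y, phi):
    """Finite rotation obtained by integrating the same first-order changes."""
    c, s = math.cos(phi), math.sin(phi)
    return a_x * c + a_y * s, a_y * c - a_x * s

approx = rotate_by_small_steps(1.0, 0.5, 1.0, 100000)
exact = rotate_exactly(1.0, 0.5, 1.0)
print(approx, exact)
```

The two results agree to within a few parts in a million, and the agreement improves as the number of steps is increased.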

The  converse  result  is  also  true,  a spherically  symmetrical  state  has 
zero  total  angular  momentum.  This  is  obvious  physically,  since  angular 
momentum  is  of  the  nature  of  a vector  and,  if  it  is  not  zero,  its  existence 
must  destroy  the  spherical  symmetry. 

It should be noted that in (33) we have a ket $|S\rangle$ that is a simultaneous
eigenket for non-commuting observables. This is usually not
possible,  but  it  is  possible  in  the  present  special  case,  because  the  three 
equations  (33)  together  with  the  commutation  relations  (29)  do  not 
lead  to  any  inconsistency. 


36.  Properties  of  angular  momentum 

There  are  some  general  properties  of  angular  momentum,  deducible 
simply from the commutation relations between the three components.
These properties must hold equally for spin and orbital angular
momentum. Let $m_x$, $m_y$, $m_z$ be the three components of an angular
momentum, and introduce the quantity $\beta$ defined by

$$\beta = m_x^2 + m_y^2 + m_z^2.$$

Since $\beta$ is a scalar it must commute with $m_x$, $m_y$, and $m_z$. Let us
suppose we have a dynamical system for which $m_x$, $m_y$, $m_z$ are the
only dynamical variables. Then $\beta$ commutes with everything and
must be a number. We can study this dynamical system on much
the same lines as we used for the harmonic oscillator in § 34.

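Put

$$m_x - im_y = \eta.$$

From the commutation relations (27) we get

$$\bar\eta\eta = (m_x + im_y)(m_x - im_y) = m_x^2 + m_y^2 - i(m_xm_y - m_ym_x)
  = \beta - m_z^2 + \hbar m_z \tag{34}$$

and similarly

$$\eta\bar\eta = \beta - m_z^2 - \hbar m_z. \tag{35}$$

Thus

$$\bar\eta\eta - \eta\bar\eta = 2\hbar m_z. \tag{36}$$

Also

$$m_z\eta - \eta m_z = i\hbar m_y - \hbar m_x = -\hbar\eta. \tag{37}$$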

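We assume that the components of an angular momentum are
observables and thus $m_z$ has eigenvalues. Let $m'_z$ be one of them,
and $|m'_z\rangle$ an eigenket belonging to it. From (34)

$$\langle m'_z|\bar\eta\eta|m'_z\rangle = \langle m'_z|\beta - m_z^2 + \hbar m_z|m'_z\rangle = (\beta - m_z'^2 + \hbar m'_z)\langle m'_z|m'_z\rangle.$$

The left-hand side here is the square of the length of the ket $\eta|m'_z\rangle$,
and is thus greater than or equal to zero, the case of equality occurring
if and only if $\eta|m'_z\rangle = 0$. Hence

$$\beta - m_z'^2 + \hbar m'_z \geq 0,$$

or

$$\beta + \tfrac14\hbar^2 \geq (m'_z - \tfrac12\hbar)^2. \tag{38}$$

Thus

$$\beta + \tfrac14\hbar^2 \geq 0.$$

Defining the number $k$ by

$$k + \tfrac12\hbar = (\beta + \tfrac14\hbar^2)^{1/2} = (m_x^2 + m_y^2 + m_z^2 + \tfrac14\hbar^2)^{1/2}, \tag{39}$$

so that $k \geq -\tfrac12\hbar$, the inequality (38) becomes

$$k + \tfrac12\hbar \geq |m'_z - \tfrac12\hbar|$$

or

$$k + \hbar \geq m'_z \geq -k. \tag{40}$$

An equality occurs if and only if $\eta|m'_z\rangle = 0$. Similarly from (35)

$$\langle m'_z|\eta\bar\eta|m'_z\rangle = (\beta - m_z'^2 - \hbar m'_z)\langle m'_z|m'_z\rangle,$$

showing that

$$\beta - m_z'^2 - \hbar m'_z \geq 0$$

or

$$k \geq m'_z \geq -k - \hbar,$$

with an equality occurring if and only if $\bar\eta|m'_z\rangle = 0$. This result
combined with (40) shows that $k \geq 0$ and

$$k \geq m'_z \geq -k, \tag{41}$$

with $m'_z = k$ if $\bar\eta|m'_z\rangle = 0$ and $m'_z = -k$ if $\eta|m'_z\rangle = 0$.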

From (37)

$$m_z\eta|m_z'\rangle = (\eta m_z - \hbar\eta)|m_z'\rangle = (m_z' - \hbar)\,\eta|m_z'\rangle.$$

Now if $m_z' \ne -k$, $\eta|m_z'\rangle$ is not zero and is then an eigenket of $m_z$ belonging to the eigenvalue $m_z' - \hbar$. Similarly, if $m_z' - \hbar \ne -k$, $m_z' - 2\hbar$ is another eigenvalue of $m_z$, and so on. We get in this way a series of eigenvalues $m_z',\ m_z' - \hbar,\ m_z' - 2\hbar, \ldots$, which must terminate from (41), and can terminate only with the value $-k$. Again, from the conjugate complex of equation (37)

$$m_z\bar\eta|m_z'\rangle = (\bar\eta m_z + \hbar\bar\eta)|m_z'\rangle = (m_z' + \hbar)\,\bar\eta|m_z'\rangle,$$

showing that $m_z' + \hbar$ is another eigenvalue of $m_z$ unless $\bar\eta|m_z'\rangle = 0$, in which case $m_z' = k$. Continuing in this way we get a series of eigenvalues $m_z',\ m_z' + \hbar,\ m_z' + 2\hbar, \ldots$, which must terminate from (41), and can terminate only with the value $k$. We can conclude that $2k$ is an integral multiple of $\hbar$ and that the eigenvalues of $m_z$ are

$$k,\ k - \hbar,\ k - 2\hbar,\ \ldots,\ -k + \hbar,\ -k. \tag{42}$$

The eigenvalues of $m_x$ and $m_y$ are the same, from symmetry. These eigenvalues are all integral or half odd integral multiples of $\hbar$, according to whether $2k$ is an even or odd multiple of $\hbar$.
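
The ladder argument above can be checked numerically. The following is a minimal sketch, assuming units in which $\hbar = 1$ and assuming the usual phase convention in which the matrix of $\eta$ is real; the function names are illustrative and not part of the text. It builds $m_z$ and $\eta$ on the eigenvalues (42) and tests relations (37) and (34).

```python
from fractions import Fraction
from math import sqrt

def ladder_matrices(k):
    """Return (mz, eta) as nested lists for magnitude k, in units where hbar = 1."""
    two_k = Fraction(k) * 2
    if two_k.denominator != 1 or two_k < 0:
        raise ValueError("2k must be a non-negative integral multiple of hbar")
    dim = int(two_k) + 1
    values = [Fraction(k) - j for j in range(dim)]      # eigenvalues (42): k, k-1, ..., -k
    mz = [[float(values[i]) if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    eta = [[0.0] * dim for _ in range(dim)]
    kf = float(Fraction(k))
    for j in range(dim - 1):
        m = float(values[j])
        # eta lowers m_z' by one unit; its squared length follows from (34):
        # beta - m^2 + m, with beta = k(k + 1).
        eta[j + 1][j] = sqrt(kf * (kf + 1) - m * m + m)
    return mz, eta

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][p] * b[p][j] for p in range(n)) for j in range(n)] for i in range(n)]

def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

k = Fraction(3, 2)
mz, eta = ladder_matrices(k)
eta_bar = [list(row) for row in zip(*eta)]   # eta is real here, so its conjugate is its transpose
dim = len(mz)

# (37): mz*eta - eta*mz = -hbar*eta
lhs37 = [[x - y for x, y in zip(r1, r2)] for r1, r2 in zip(matmul(mz, eta), matmul(eta, mz))]
err37 = max_abs_diff(lhs37, [[-x for x in row] for row in eta])

# (34): eta_bar*eta = beta - mz^2 + hbar*mz
beta = float(k * (k + 1))
rhs34 = [[(beta - mz[i][i] ** 2 + mz[i][i]) if i == j else 0.0 for j in range(dim)]
         for i in range(dim)]
err34 = max_abs_diff(matmul(eta_bar, eta), rhs34)

print([mz[i][i] for i in range(dim)], err37, err34)
```

The last column of the $\eta$ matrix is zero, which is the matrix form of the statement that $\eta$ applied to the ket with $m_z' = -k$ gives zero.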

Let $|\text{max}\rangle$ be an eigenket of $m_z$ belonging to the maximum eigenvalue $k$, so that

$$\bar\eta|\text{max}\rangle = 0, \tag{43}$$

and form the sequence of kets

$$|\text{max}\rangle,\ \eta|\text{max}\rangle,\ \eta^2|\text{max}\rangle,\ \ldots,\ \eta^{2k/\hbar}|\text{max}\rangle. \tag{44}$$

These kets are all eigenkets of $m_z$, belonging to the sequence of eigenvalues (42) respectively. The set of kets (44) is such that the operator $\eta$ applied to any one of them gives a ket dependent on the set ($\eta$ applied to the last gives zero), and from (36) and (43) one sees that $\bar\eta$ applied to any one of the set also gives a ket dependent on the set. All the dynamical variables for the system we are now dealing with are expressible in terms of $\eta$ and $\bar\eta$, so the set of kets (44) is a complete set. There is just one of these kets for each eigenvalue (42) of $m_z$, so $m_z$ by itself forms a complete commuting set of observables.

It is convenient to define the magnitude of the angular momentum vector $\mathbf{m}$ to be $k$, given by (39), rather than $\beta^{1/2}$, because the possible values for $k$ are

$$0,\ \tfrac12\hbar,\ \hbar,\ \tfrac32\hbar,\ 2\hbar,\ \ldots, \tag{45}$$

extending to infinity, while the possible values for $\beta^{1/2}$ are a more complicated set of numbers.

For a dynamical system involving other dynamical variables besides $m_x$, $m_y$, and $m_z$, there may be variables that do not commute with $\beta$. Then $\beta$ is no longer a number, but a general linear operator. This happens for any orbital angular momentum (22), as $x$, $y$, $z$, $p_x$, $p_y$, and $p_z$ do not commute with $\beta$. We shall assume that $\beta$ is always an observable, and $k$ can then be defined by (39) with the positive square root and is also an observable. We shall call $k$ so defined the magnitude of the angular momentum vector $\mathbf{m}$ in the general case. The above analysis by which we obtained the eigenvalues of $m_z$ is still valid if we replace $|m_z'\rangle$ by a simultaneous eigenket $|k'm_z'\rangle$ of the commuting observables $k$ and $m_z$, and leads to the result that the possible eigenvalues for $k$ are the numbers (45), and for each eigenvalue $k'$ of $k$ the eigenvalues of $m_z$ are the numbers (42) with $k'$ substituted for $k$. We have here an example of a phenomenon which we have not met with previously, namely that with two commuting observables, the eigenvalues of one depend on what eigenvalue we assign to the other. This phenomenon may be understood as the two observables being not altogether independent, but partially functions of one another. The number of independent simultaneous eigenkets of $k$ and $m_z$ belonging to the eigenvalues $k'$ and $m_z'$ must be independent of $m_z'$, since for each independent $|k'm_z'\rangle$ we can obtain an independent $|k'm_z''\rangle$, for any $m_z''$ in the sequence (42), by multiplying $|k'm_z'\rangle$ by a suitable power of $\eta$ or $\bar\eta$.

As an example let us consider a dynamical system with two angular momenta $\mathbf{m}_1$ and $\mathbf{m}_2$, which commute with one another. If there are no other dynamical variables, then all the dynamical variables commute with the magnitudes $k_1$ and $k_2$ of $\mathbf{m}_1$ and $\mathbf{m}_2$, so $k_1$ and $k_2$ are numbers. However, the magnitude $K$ of the resultant angular momentum $\mathbf{M} = \mathbf{m}_1 + \mathbf{m}_2$ is not a number (it does not commute with the components of $\mathbf{m}_1$ and $\mathbf{m}_2$) and it is interesting to work out the eigenvalues of $K$. This can be done most simply by a method of counting independent kets. There is one independent simultaneous eigenket of $m_{1z}$ and $m_{2z}$ belonging to any eigenvalue $m_{1z}'$ having one of the values $k_1,\ k_1 - \hbar,\ k_1 - 2\hbar, \ldots, -k_1$ and any eigenvalue $m_{2z}'$ having one of the values $k_2,\ k_2 - \hbar,\ k_2 - 2\hbar, \ldots, -k_2$, and this ket is an eigenket of $M_z$ belonging to the eigenvalue $M_z' = m_{1z}' + m_{2z}'$. The possible values of $M_z'$ are thus $k_1 + k_2,\ k_1 + k_2 - \hbar,\ k_1 + k_2 - 2\hbar, \ldots, -k_1 - k_2$, and the number of times each of them occurs is given by the following scheme (if we assume for definiteness that $k_1 \ge k_2$, and take $k_2$ in the second row to be measured in units of $\hbar$),

$$\begin{array}{ccccccccccc}
k_1+k_2, & k_1+k_2-\hbar, & k_1+k_2-2\hbar, & \ldots, & k_1-k_2, & k_1-k_2-\hbar, & \ldots, & -k_1+k_2, & -k_1+k_2-\hbar, & \ldots, & -k_1-k_2 \\
1 & 2 & 3 & \ldots & 2k_2+1 & 2k_2+1 & \ldots & 2k_2+1 & 2k_2 & \ldots & 1
\end{array} \tag{46}$$

Now each eigenvalue $K'$ of $K$ will be associated with the eigenvalues $K',\ K' - \hbar,\ K' - 2\hbar, \ldots, -K'$ for $M_z$, with the same number of independent simultaneous eigenkets of $K$ and $M_z$ for each of them. The total number of independent eigenkets of $M_z$ belonging to any eigenvalue $M_z'$ must be the same, whether we take them to be simultaneous eigenkets of $m_{1z}$ and $m_{2z}$ or simultaneous eigenkets of $K$ and $M_z$, i.e. it is always given by the scheme (46). It follows that the eigenvalues for $K$ are

$$k_1 + k_2,\ k_1 + k_2 - \hbar,\ k_1 + k_2 - 2\hbar,\ \ldots,\ k_1 - k_2, \tag{47}$$

and that for each of these eigenvalues for $K$ and an eigenvalue for $M_z$ going with it there is just one independent simultaneous eigenket of $K$ and $M_z$.
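
The counting method lends itself to a short mechanical check. The sketch below is only an illustration, assuming units in which $\hbar = 1$; the names `ladder` and `resultant_magnitudes` are hypothetical. It tabulates how often each $M_z'$ occurs, as in the scheme (46), and then removes one complete set $K', K' - 1, \ldots, -K'$ at a time, starting from the largest remaining value.

```python
from collections import Counter
from fractions import Fraction

def ladder(k):
    """Eigenvalues k, k-1, ..., -k of one component, in units of hbar."""
    k = Fraction(k)
    return [k - j for j in range(int(2 * k) + 1)]

def resultant_magnitudes(k1, k2):
    """Eigenvalues of K obtained by peeling multiplets off the scheme (46)."""
    # Number of independent kets for each M_z' = m_1z' + m_2z'.
    counts = Counter(a + b for a in ladder(k1) for b in ladder(k2))
    magnitudes = []
    while sum(counts.values()) > 0:
        top = max(v for v, c in counts.items() if c > 0)
        magnitudes.append(top)
        for mz in ladder(top):        # one ket for each M_z' = K', K'-1, ..., -K'
            counts[mz] -= 1
    return magnitudes

print(resultant_magnitudes(Fraction(3, 2), 1))
```

For $k_1 = \tfrac32\hbar$ and $k_2 = \hbar$ the values obtained are $\tfrac52\hbar, \tfrac32\hbar, \tfrac12\hbar$, in agreement with (47).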

The effect of rotations on eigenkets of angular momentum variables should be noted. Take any eigenket $|M_z'\rangle$ of the $z$ component of total angular momentum for any dynamical system, and apply to it a small rotation through an angle $\delta\phi$ about the $z$-axis. It will change into

$$(1 + \delta\phi\, r_z)|M_z'\rangle = (1 - i\,\delta\phi\, M_z/\hbar)|M_z'\rangle$$

with the help of (32). This equals

$$(1 - i\,\delta\phi\, M_z'/\hbar)|M_z'\rangle = e^{-i\,\delta\phi\, M_z'/\hbar}|M_z'\rangle$$

to the first order in $\delta\phi$. Thus $|M_z'\rangle$ gets multiplied by the numerical factor $e^{-i\,\delta\phi\, M_z'/\hbar}$. By applying a succession of these small rotations, we find that the application of a finite rotation through an angle $\phi$ about the $z$-axis causes $|M_z'\rangle$ to get multiplied by $e^{-i\phi M_z'/\hbar}$. Putting $\phi = 2\pi$, we find that an application of one revolution about the $z$-axis leaves $|M_z'\rangle$ unchanged if the eigenvalue $M_z'$ is an integral multiple of $\hbar$ and causes $|M_z'\rangle$ to change sign if $M_z'$ is half an odd integral multiple of $\hbar$. Now consider an eigenket $|K'\rangle$ of the magnitude $K$ of the total angular momentum. If the eigenvalue $K'$ is an integral multiple of $\hbar$, the possible eigenvalues of $M_z$ are all integral multiples of $\hbar$ and the application of one revolution about the $z$-axis must leave $|K'\rangle$ unchanged. Conversely, if $K'$ is half an odd integral multiple of $\hbar$, the possible eigenvalues of $M_z$ are all half odd integral multiples of $\hbar$ and the revolution must change the sign of $|K'\rangle$. From symmetry, the application of a revolution about any other axis must have the same effect on $|K'\rangle$ as one about the $z$-axis. We thus get the general result, the application of one revolution about any axis leaves a ket unchanged or changes its sign according to whether it belongs to eigenvalues of the magnitude of the total angular momentum which are integral or half odd integral multiples of $\hbar$. A state, of course, is always unaffected by the revolution, since a state is unaffected by a change of sign of the ket corresponding to it.
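
As a numerical illustration of the factor $e^{-i\phi M_z'/\hbar}$, the following sketch (assuming units with $\hbar = 1$; the function names are illustrative only) evaluates the factor for one revolution and also compares it with a product of many first-order factors $(1 - i\,\delta\phi\, M_z'/\hbar)$, which is the succession of small rotations referred to in the text.

```python
import cmath
from fractions import Fraction

def rotation_factor(mz, phi):
    """Factor exp(-i*phi*Mz'/hbar) acquired by |Mz'>, with Mz' given in units of hbar."""
    return cmath.exp(-1j * phi * float(mz))

def small_rotation_product(mz, phi, steps):
    """Product of `steps` first-order factors (1 - i*dphi*Mz'/hbar), with dphi = phi/steps."""
    dphi = phi / steps
    return (1 - 1j * dphi * float(mz)) ** steps

full_turn = 2 * cmath.pi
for mz in (Fraction(1), Fraction(1, 2), Fraction(3, 2), Fraction(2)):
    print(mz, rotation_factor(mz, full_turn))

# The first-order product approaches the exponential as the steps become small.
print(small_rotation_product(Fraction(1, 2), full_turn, 100000))
```

The integral values give a factor close to $+1$ and the half odd integral values a factor close to $-1$, up to rounding.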

For a dynamical system involving only orbital angular momenta, a ket must be unchanged by a revolution about an axis, since we can set up Schrödinger's representation, with the coordinates of all the particles diagonal, and the Schrödinger representative of a ket will get brought back to its original value by the revolution. It follows that the eigenvalues of the magnitude of an orbital angular momentum are always integral multiples of $\hbar$. The eigenvalues of a component of an orbital angular momentum are also always integral multiples of $\hbar$. For a spin angular momentum, Schrödinger's representation does not exist and both kinds of eigenvalue are possible.


37.  The  spin  of  the  electron 

Electrons, and also some of the other fundamental particles (protons, neutrons) have a spin whose magnitude is $\tfrac12\hbar$. This is found from experimental evidence, and also there are theoretical reasons showing that this spin value is more elementary than any other, even spin zero (see Chapter XI). The study of this particular spin is therefore of special importance.

For dealing with an angular momentum $\mathbf{m}$ whose magnitude is $\tfrac12\hbar$, it is convenient to put

$$\mathbf{m} = \tfrac12\hbar\,\boldsymbol{\sigma}. \tag{48}$$

The components of the vector $\boldsymbol{\sigma}$ then satisfy, from (27),

$$\left.\begin{aligned}
\sigma_y\sigma_z - \sigma_z\sigma_y &= 2i\sigma_x, \\
\sigma_z\sigma_x - \sigma_x\sigma_z &= 2i\sigma_y, \\
\sigma_x\sigma_y - \sigma_y\sigma_x &= 2i\sigma_z.
\end{aligned}\right\} \tag{49}$$

The eigenvalues of $m_z$ are $\tfrac12\hbar$ and $-\tfrac12\hbar$, so the eigenvalues of $\sigma_z$ are $1$ and $-1$, and $\sigma_z^2$ has just the one eigenvalue $1$. It follows that $\sigma_z^2$ must equal $1$, and similarly for $\sigma_x^2$ and $\sigma_y^2$, i.e.

$$\sigma_x^2 = \sigma_y^2 = \sigma_z^2 = 1. \tag{50}$$

We can get equations (49) and (50) into a simpler form by means of some straightforward non-commutative algebra. From (50)

$$\sigma_y^2\sigma_z - \sigma_z\sigma_y^2 = 0$$

or

$$\sigma_y(\sigma_y\sigma_z - \sigma_z\sigma_y) + (\sigma_y\sigma_z - \sigma_z\sigma_y)\sigma_y = 0$$

or

$$\sigma_y\sigma_x + \sigma_x\sigma_y = 0$$

with the help of the first of equations (49). This means $\sigma_x\sigma_y = -\sigma_y\sigma_x$. Two dynamical variables or linear operators like these which satisfy the commutative law of multiplication except for a minus sign will be said to anticommute. Thus $\sigma_x$ anticommutes with $\sigma_y$. From symmetry each of the three dynamical variables $\sigma_x$, $\sigma_y$, $\sigma_z$ must anticommute with any other. Equations (49) may now be written

$$\left.\begin{aligned}
\sigma_y\sigma_z &= i\sigma_x = -\sigma_z\sigma_y, \\
\sigma_z\sigma_x &= i\sigma_y = -\sigma_x\sigma_z, \\
\sigma_x\sigma_y &= i\sigma_z = -\sigma_y\sigma_x,
\end{aligned}\right\} \tag{51}$$

and also from (50)

$$\sigma_x\sigma_y\sigma_z = i. \tag{52}$$

Equations (50), (51), (52) are the fundamental equations satisfied by the spin variables $\boldsymbol{\sigma}$ describing a spin whose magnitude is $\tfrac12\hbar$.

Let us set up a matrix representation for the $\sigma$'s and let us take $\sigma_z$ to be diagonal. If there are no other independent dynamical variables besides the $m$'s or $\sigma$'s in our dynamical system, then $\sigma_z$ by itself forms a complete set of commuting observables, since the form of equations (50) and (51) is such that we cannot construct out of $\sigma_x$, $\sigma_y$, and $\sigma_z$ any new dynamical variable that commutes with $\sigma_z$. The diagonal elements of the matrix representing $\sigma_z$ being the eigenvalues $1$ and $-1$ of $\sigma_z$, the matrix itself will be

$$\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$

Let $\sigma_x$ be represented by

$$\begin{pmatrix} a_1 & a_2 \\ a_3 & a_4 \end{pmatrix}.$$

This matrix must be Hermitian, so that $a_1$ and $a_4$ must be real and $a_2$ and $a_3$ conjugate complex numbers. The equation $\sigma_z\sigma_x = -\sigma_x\sigma_z$ gives us

$$\begin{pmatrix} a_1 & a_2 \\ -a_3 & -a_4 \end{pmatrix} = -\begin{pmatrix} a_1 & -a_2 \\ a_3 & -a_4 \end{pmatrix},$$

so that $a_1 = a_4 = 0$. Hence $\sigma_x$ is represented by a matrix of the form

$$\begin{pmatrix} 0 & a_2 \\ a_3 & 0 \end{pmatrix}.$$

The equation $\sigma_x^2 = 1$ now shows that $a_2 a_3 = 1$. Thus $a_2$ and $a_3$, being conjugate complex numbers, must be of the form $e^{i\alpha}$ and $e^{-i\alpha}$ respectively, where $\alpha$ is a real number, so that $\sigma_x$ is represented by a matrix of the form

$$\begin{pmatrix} 0 & e^{i\alpha} \\ e^{-i\alpha} & 0 \end{pmatrix}.$$

Similarly it may be shown that $\sigma_y$ is also represented by a matrix of this form. By suitably choosing the phase factors in the representation, which is not completely determined by the condition that $\sigma_z$ shall be diagonal, we can arrange that $\sigma_x$ shall be represented by the matrix

$$\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.$$

The representative of $\sigma_y$ is then determined by the equation $\sigma_y = i\sigma_x\sigma_z$. We thus obtain finally the three matrices

$$\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \tag{53}$$

to represent $\sigma_x$, $\sigma_y$, and $\sigma_z$ respectively, which matrices satisfy all the algebraic relations (49), (50), (51), (52). The component of the vector $\boldsymbol{\sigma}$ in an arbitrary direction specified by the direction cosines $l$, $m$, $n$, namely $l\sigma_x + m\sigma_y + n\sigma_z$, is represented by

$$\begin{pmatrix} n & l - im \\ l + im & -n \end{pmatrix}. \tag{54}$$
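
That the three matrices satisfy the algebraic relations can be confirmed by direct multiplication. The sketch below does this with plain nested lists; the helper names (`mul`, `scale`, `sub`, `add3`, `direction_component`) are introduced only for this illustration.

```python
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]
ID = [[1, 0], [0, 1]]

def mul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][p] * b[p][j] for p in range(2)) for j in range(2)] for i in range(2)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def sub(a, b):
    return [[x - y for x, y in zip(r, s)] for r, s in zip(a, b)]

def add3(a, b, c):
    return [[x + y + z for x, y, z in zip(r, s, t)] for r, s, t in zip(a, b, c)]

def direction_component(l, m, n):
    """Matrix (54) representing l*sigma_x + m*sigma_y + n*sigma_z."""
    return [[n, l - 1j * m], [l + 1j * m, -n]]

checks = {
    "(49) commutator": sub(mul(SY, SZ), mul(SZ, SY)) == scale(2j, SX),
    "(50) squares": mul(SX, SX) == ID and mul(SY, SY) == ID and mul(SZ, SZ) == ID,
    "(51) products": mul(SY, SZ) == scale(1j, SX) and mul(SZ, SY) == scale(-1j, SX),
    "(52) triple product": mul(mul(SX, SY), SZ) == scale(1j, ID),
    "(54) component": direction_component(1, 2, 3)
                      == add3(scale(1, SX), scale(2, SY), scale(3, SZ)),
}
print(checks)
```

Only the first of equations (49) and the first of (51) are tested explicitly; the others follow in the same way by cyclic permutation of the three matrices.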

The representative of a ket vector will consist of just two numbers, corresponding to the two values $+1$ and $-1$ for $\sigma_z'$. These two numbers form a function of the variable $\sigma_z'$ whose domain consists of only the two points $+1$ and $-1$. The state for which $\sigma_z$ has the value unity will be represented by the function, $f_\alpha(\sigma_z')$ say, consisting of the pair of numbers $1, 0$ and that for which $\sigma_z$ has the value $-1$ will be represented by the function, $f_\beta(\sigma_z')$ say, consisting of the pair $0, 1$. Any function of the variable $\sigma_z'$, i.e. any pair of numbers, can be expressed as a linear combination of these two. Thus any state can be obtained by superposition of the two states for which $\sigma_z$ equals $+1$ and $-1$ respectively. For example, the state for which the component of $\boldsymbol{\sigma}$ in the direction $l$, $m$, $n$, represented by (54), has the value $+1$ is represented by the pair of numbers $a$, $b$ which satisfy

$$\begin{pmatrix} n & l - im \\ l + im & -n \end{pmatrix}\begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} a \\ b \end{pmatrix}$$

or

$$na + (l - im)b = a, \qquad (l + im)a - nb = b.$$

Thus

$$\frac{a}{b} = \frac{l - im}{1 - n} = \frac{1 + n}{l + im}.$$

This state can be regarded as a superposition of the two states for which $\sigma_z$ equals $+1$ and $-1$, the relative weights in the superposition process being as

$$|a|^2 : |b|^2 = |l - im|^2 : (1 - n)^2 = 1 + n : 1 - n. \tag{55}$$
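
The ratio (55) can be verified for a particular direction. In the sketch below the polar angles `theta` and `phi` are arbitrary illustrative values, and the pair $(a, b)$ is taken unnormalized as $(l - im,\ 1 - n)$, which assumes $n \ne 1$.

```python
import math

def spin_up_along(l, m, n):
    """Unnormalised pair (a, b) with a/b = (l - i m)/(1 - n); assumes n != 1."""
    return complex(l, -m), complex(1 - n, 0)

def apply_54(l, m, n, a, b):
    """Apply the matrix (54) to the pair (a, b)."""
    return n * a + complex(l, -m) * b, complex(l, m) * a - n * b

theta, phi = 1.1, 0.4     # hypothetical angles fixing a unit vector (l, m, n)
l = math.sin(theta) * math.cos(phi)
m = math.sin(theta) * math.sin(phi)
n = math.cos(theta)

a, b = spin_up_along(l, m, n)
a2, b2 = apply_54(l, m, n, a, b)          # should reproduce (a, b): eigenvalue +1
weight_ratio = abs(a) ** 2 / abs(b) ** 2
print(weight_ratio, (1 + n) / (1 - n))
```

Since $n = \cos\theta$, the ratio $(1 + n)/(1 - n)$ equals $\cot^2(\theta/2)$, so the weights depend only on the angle between the chosen direction and the $z$-axis.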

For the complete description of an electron (or other elementary particle with spin $\tfrac12\hbar$) we require the spin dynamical variables $\boldsymbol{\sigma}$, whose connexion with the spin angular momentum is given by (48), together with the Cartesian coordinates $x$, $y$, $z$ and momenta $p_x$, $p_y$, $p_z$. The spin dynamical variables commute with these coordinates and momenta. Thus a complete set of commuting observables for a system consisting of a single electron will be $x$, $y$, $z$, $\sigma_z$. In a representation in which these are diagonal, the representative of any state will be a function of four variables $x'$, $y'$, $z'$, $\sigma_z'$. Since $\sigma_z'$ has a domain consisting of only two points, namely $1$ and $-1$, this function of four variables is the same as two functions of three variables, namely the two functions

$$\langle x'y'z'|\rangle_+ = \langle x', y', z', +1|\rangle, \qquad \langle x'y'z'|\rangle_- = \langle x', y', z', -1|\rangle. \tag{56}$$

Thus the presence of the spin may be considered either as introducing a new variable into the representative of a state or as giving this representative two components.

38.  Motion  in  a central  field  of  force 

An  atom  consists  of  a massive  positively  charged  nucleus  together 
with  a number  of  electrons  moving  round,  under  the  influence  of  the 
attractive  force  of  the  nucleus  and  their  own  mutual  repulsions.  An 
exact  treatment  of  this  dynamical  system  is  a very  difficult  mathe- 
matical problem.  One  can,  however,  gain  some  insight  into  the  main 
features  of  the  system  by  making  the  rough  approximation  of  regard- 
ing each  electron  as  moving  independently  in  a certain  central  field 
of  force,  namely  that  of  the  nucleus,  assumed  fixed,  together  with 
some  kind  of  average  of  the  forces  due  to  the  other  electrons.  Thus 
our  present  problem  of  the  motion  of  a particle  in  a central  field  of 
force  forms  a corner-stone  in  the  theory  of  the  atom. 

Let the Cartesian coordinates of the particle, referred to a system of axes with the centre of force as origin, be $x$, $y$, $z$ and the corresponding components of momentum $p_x$, $p_y$, $p_z$. The Hamiltonian, with neglect of relativistic mechanics, will be of the form

$$H = \frac{1}{2m}(p_x^2 + p_y^2 + p_z^2) + V, \tag{57}$$

where $V$, the potential energy, is a function only of $(x^2 + y^2 + z^2)$. To develop the theory it is convenient to introduce polar dynamical variables. We introduce first the radius $r$, defined as the positive square root

$$r = (x^2 + y^2 + z^2)^{1/2}.$$

Its eigenvalues go from $0$ to $\infty$. If we evaluate its P.B.s with $p_x$, $p_y$, and $p_z$, we obtain, with the help of formula (32) of § 22,

$$[r, p_x] = \frac{\partial r}{\partial x} = \frac{x}{r}, \qquad [r, p_y] = \frac{y}{r}, \qquad [r, p_z] = \frac{z}{r},$$

the same as in the classical theory. We introduce also the dynamical variable $p_r$ defined by

$$p_r = r^{-1}(xp_x + yp_y + zp_z). \tag{58}$$

Its P.B. with $r$ is given by

$$\begin{aligned}
r[r, p_r] = [r, rp_r] &= [r, xp_x + yp_y + zp_z] \\
&= x[r, p_x] + y[r, p_y] + z[r, p_z] \\
&= x\cdot x/r + y\cdot y/r + z\cdot z/r = r.
\end{aligned}$$

Hence

$$[r, p_r] = 1$$

or

$$rp_r - p_r r = i\hbar.$$

The commutation relation between $r$ and $p_r$ is just the one for a canonical coordinate and momentum, namely equation (10) of § 22. This makes $p_r$ like the momentum conjugate to the $r$ coordinate, but it is not exactly equal to this momentum because it is not real, its conjugate complex being

$$\begin{aligned}
\bar p_r &= (p_x x + p_y y + p_z z)r^{-1} = (xp_x + yp_y + zp_z - 3i\hbar)r^{-1} \\
&= (rp_r - 3i\hbar)r^{-1} = p_r - 2i\hbar r^{-1}.
\end{aligned} \tag{59}$$

Thus $p_r - i\hbar r^{-1}$ is real and is the true momentum conjugate to $r$.

The angular momentum $\mathbf{m}$ of the particle about the origin is given by (22) and its magnitude $k$ is given by (39). Since $r$ and $p_r$ are scalars, they commute with $\mathbf{m}$, and therefore also with $k$.

We can express the Hamiltonian in terms of $r$, $p_r$, and $k$. We have, if $\sum_{xyz}$ denotes a sum over cyclic permutations of the suffixes $x$, $y$, $z$,

$$\begin{aligned}
k(k + \hbar) &= \sum_{xyz} m_z^2 = \sum_{xyz}(xp_y - yp_x)^2 \\
&= \sum_{xyz}(xp_y\,xp_y + yp_x\,yp_x - xp_y\,yp_x - yp_x\,xp_y) \\
&= \sum_{xyz}(x^2p_y^2 + y^2p_x^2 - xp_xp_yy - yp_yp_xx + x^2p_x^2 - xp_xp_xx - 2i\hbar\,xp_x) \\
&= (x^2 + y^2 + z^2)(p_x^2 + p_y^2 + p_z^2) - (xp_x + yp_y + zp_z)(p_xx + p_yy + p_zz + 2i\hbar) \\
&= r^2(p_x^2 + p_y^2 + p_z^2) - rp_r(\bar p_r r + 2i\hbar) \\
&= r^2(p_x^2 + p_y^2 + p_z^2) - rp_r^2 r,
\end{aligned}$$

from (59). Hence

$$H = \frac{1}{2m}\left(\frac{1}{r}p_r^2 r + \frac{k(k + \hbar)}{r^2}\right) + V. \tag{60}$$

This form for $H$ is such that $k$ commutes not only with $H$, as is necessary since $k$ is a constant of the motion, but also with every dynamical variable occurring in $H$, namely $r$, $p_r$, and $V$, which is a function of $r$. In consequence, a simple treatment becomes possible, namely, we may consider an eigenstate of $k$ belonging to an eigenvalue $k'$ and then we can substitute $k'$ for $k$ in (60) and get a problem in one degree of freedom $r$.

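Let us introduce Schrödinger's representation with $x$, $y$, $z$ diagonal. Then $p_x$, $p_y$, $p_z$ are equal to the operators $-i\hbar\,\partial/\partial x$, $-i\hbar\,\partial/\partial y$, $-i\hbar\,\partial/\partial z$ respectively. A state is represented by a wave function $\psi(xyzt)$ satisfying Schrödinger's wave equation (7) of § 27, which now reads, with $H$ given by (57),

$$i\hbar\frac{\partial\psi}{\partial t} = \left\{-\frac{\hbar^2}{2m}\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}\right) + V\right\}\psi. \tag{61}$$

We may pass from the Cartesian coordinates $x$, $y$, $z$ to the polar coordinates $r$, $\theta$, $\phi$ by means of the equations

$$\left.\begin{aligned}
x &= r\sin\theta\cos\phi, \\
y &= r\sin\theta\sin\phi, \\
z &= r\cos\theta,
\end{aligned}\right\} \tag{62}$$

and may express the wave function in terms of the polar coordinates, so that it reads $\psi(r\theta\phi t)$. The equations (62) give the operator equation

$$\frac{\partial}{\partial r} = \frac{\partial x}{\partial r}\frac{\partial}{\partial x} + \frac{\partial y}{\partial r}\frac{\partial}{\partial y} + \frac{\partial z}{\partial r}\frac{\partial}{\partial z} = \frac{x}{r}\frac{\partial}{\partial x} + \frac{y}{r}\frac{\partial}{\partial y} + \frac{z}{r}\frac{\partial}{\partial z},$$

which shows, on being compared with (58), that $p_r = -i\hbar\,\partial/\partial r$. Thus Schrödinger's wave equation reads, with the form (60) for $H$,

$$i\hbar\frac{\partial\psi}{\partial t} = \left\{\frac{\hbar^2}{2m}\left(-\frac{1}{r}\frac{\partial^2}{\partial r^2}r + \frac{k(k + \hbar)}{\hbar^2r^2}\right) + V\right\}\psi. \tag{63}$$

Here $k$ is a certain linear operator which, since it commutes with $r$ and $\partial/\partial r$, can involve only $\theta$, $\phi$, $\partial/\partial\theta$, and $\partial/\partial\phi$. From the formula

$$k(k + \hbar) = m_x^2 + m_y^2 + m_z^2, \tag{64}$$

which comes from (39), and from (62) one can work out the form of $k(k + \hbar)$ and one finds

$$\frac{k(k + \hbar)}{\hbar^2} = -\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\sin\theta\frac{\partial}{\partial\theta} - \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial\phi^2}. \tag{65}$$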

This operator is well known in mathematical physics. Its eigenfunctions are called spherical harmonics and its eigenvalues are $n(n + 1)$ where $n$ is an integer. Thus the theory of spherical harmonics provides an alternative proof that the eigenvalues of $k$ are integral multiples of $\hbar$.
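
The eigenvalue statement can be checked exactly for the spherical harmonics that do not depend on $\phi$. The sketch below rests on two assumptions not stated in the text: that such a harmonic of order $n$ is the Legendre polynomial $P_n(\cos\theta)$, and that with $x = \cos\theta$ the operator (65) then reduces to $-\frac{d}{dx}(1 - x^2)\frac{d}{dx}$. The function names are illustrative.

```python
from fractions import Fraction

def legendre(n):
    """Coefficients (lowest power first) of P_n, built with Bonnet's recursion."""
    p_prev, p = [Fraction(1)], [Fraction(0), Fraction(1)]
    if n == 0:
        return p_prev
    for j in range(1, n):
        # (j + 1) P_{j+1} = (2j + 1) x P_j - j P_{j-1}
        x_p = [Fraction(0)] + p
        padded_prev = p_prev + [Fraction(0)] * (len(x_p) - len(p_prev))
        p_prev, p = p, [((2 * j + 1) * u - j * v) / (j + 1) for u, v in zip(x_p, padded_prev)]
    return p

def derivative(c):
    return [i * c[i] for i in range(1, len(c))] or [Fraction(0)]

def angular_operator(c):
    """Apply -d/dx[(1 - x^2) d/dx] to the polynomial with coefficients c."""
    d = derivative(c)
    prod = d + [Fraction(0), Fraction(0)]      # (1 - x^2) * d, built term by term
    for i, v in enumerate(d):
        prod[i + 2] -= v
    return [-v for v in derivative(prod)]

def eigen_residual(n):
    """Largest coefficient of (operator - n(n+1)) applied to P_n; zero if P_n is an eigenfunction."""
    p = legendre(n)
    lhs = angular_operator(p)
    rhs = [n * (n + 1) * v for v in p]
    size = max(len(lhs), len(rhs))
    lhs += [Fraction(0)] * (size - len(lhs))
    rhs += [Fraction(0)] * (size - len(rhs))
    return max(abs(u - v) for u, v in zip(lhs, rhs))

print([eigen_residual(n) for n in range(6)])
```

Exact rational arithmetic is used so that a vanishing residual is an identity rather than a rounding coincidence.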


For an eigenstate of $k$ belonging to the eigenvalue $n\hbar$ ($n$ a non-negative integer) the wave function will be of the form

$$\psi = r^{-1}\chi(rt)S_n(\theta\phi), \tag{66}$$

where $S_n(\theta\phi)$ satisfies

$$k(k + \hbar)S_n(\theta\phi) = n(n + 1)\hbar^2S_n(\theta\phi), \tag{67}$$

i.e. from (65) $S_n$ is a spherical harmonic of order $n$. The factor $r^{-1}$ is inserted in (66) for convenience. Substituting (66) into (63), we get as the equation for $\chi$

$$i\hbar\frac{\partial\chi}{\partial t} = \left\{\frac{\hbar^2}{2m}\left(-\frac{\partial^2}{\partial r^2} + \frac{n(n + 1)}{r^2}\right) + V\right\}\chi. \tag{68}$$

If the state is a stationary state belonging to the energy value $H'$, $\chi$ will be of the form

$$\chi(rt) = \chi_0(r)e^{-iH't/\hbar}$$

and (68) will reduce to

$$H'\chi_0 = \left\{\frac{\hbar^2}{2m}\left(-\frac{d^2}{dr^2} + \frac{n(n + 1)}{r^2}\right) + V\right\}\chi_0. \tag{69}$$

This equation may be used to determine the energy-levels $H'$ of the system. For each solution $\chi_0$ of (69), arising from a given $n$, there will be $2n + 1$ independent states, because there are $2n + 1$ independent solutions of (67) corresponding to the $2n + 1$ different values that a component of the angular momentum, $m_z$ say, can take on.

The probability of the particle being in an element of volume $dx\,dy\,dz$ is proportional to $|\psi|^2\,dx\,dy\,dz$. With $\psi$ of the form (66) this becomes $r^{-2}|\chi|^2|S_n|^2\,dx\,dy\,dz$. The probability of the particle being in a spherical shell between $r$ and $r + dr$ is then proportional to $|\chi|^2\,dr$. It now becomes clear that, in solving equation (68) or (69), we must impose a boundary condition on the function $\chi$ at $r = 0$, namely the function must be such that the integral to the origin $\int_0 |\chi|^2\,dr$ is convergent. If this integral were not convergent, the wave function would represent a state for which the chances are infinitely in favour of the particle being at the origin and such a state would not be physically admissible.

The boundary condition at $r = 0$ obtained by the above consideration of probabilities is, however, not sufficiently stringent. We get a more stringent condition by verifying that the wave function obtained by solving the wave equation in polar coordinates (63) really satisfies the wave equation in Cartesian coordinates (61). Let us take the case




of $V = 0$, giving us the problem of the free particle. Applied to a stationary state with energy $H' = 0$, equation (61) gives

$$\nabla^2\psi = 0, \tag{70}$$

where $\nabla^2$ is written for the Laplacian operator $\partial^2/\partial x^2 + \partial^2/\partial y^2 + \partial^2/\partial z^2$, and equation (63) gives

$$\left(\frac{1}{r}\frac{\partial^2}{\partial r^2}r - \frac{k(k + \hbar)}{\hbar^2r^2}\right)\psi = 0. \tag{71}$$

A solution of (71) for $k = 0$ is $\psi = r^{-1}$. This does not satisfy (70), since, although $\nabla^2r^{-1}$ vanishes for any finite value of $r$, its integral through a volume containing the origin is $-4\pi$ (as may be verified by transforming this volume integral to a surface integral by means of Gauss's theorem), and hence

$$\nabla^2r^{-1} = -4\pi\,\delta(x)\delta(y)\delta(z). \tag{72}$$

Thus not every solution of (71) gives a solution of (70), and more generally, not every solution of (63) is a solution of (61). We must impose on the solution of (63) the condition that it shall not tend to infinity as rapidly as $r^{-1}$ when $r \to 0$ in order that, when substituted into (61), it shall not give a $\delta$ function on the right like the right-hand side of (72). Only when equation (63) is supplemented with this condition does it become equivalent to equation (61). We thus have the boundary condition $r\psi \to 0$ or $\chi \to 0$ as $r \to 0$.
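
The value $-4\pi$ quoted for the volume integral of $\nabla^2r^{-1}$ can be illustrated numerically through the surface integral given by Gauss's theorem. The sketch below is only an illustration: the choice of a cube as the enclosing surface, its size, and the midpoint rule are all assumptions made for convenience.

```python
import math

def flux_of_grad_inverse_r(half_width=1.0, cells=200):
    """Midpoint-rule surface integral of grad(1/r) . n over a cube centred on the origin."""
    h = 2 * half_width / cells
    total = 0.0
    for i in range(cells):
        u = -half_width + (i + 0.5) * h
        for j in range(cells):
            v = -half_width + (j + 0.5) * h
            r3 = (half_width ** 2 + u ** 2 + v ** 2) ** 1.5
            # On the face x = +half_width the outward normal component of
            # grad(1/r) = -(x, y, z)/r^3 is -half_width/r^3.
            total += -half_width / r3 * h * h
    return 6 * total      # the six faces contribute equally by symmetry

flux = flux_of_grad_inverse_r()
print(flux, -4 * math.pi)
```

The result does not depend on the size of the cube, which reflects the fact that $\nabla^2r^{-1}$ vanishes everywhere except at the origin.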

There are also boundary conditions for the wave function at $r = \infty$. If we are interested only in 'closed' states, i.e. states for which the particle does not go off to infinity, we must restrict the integral to infinity $\int^\infty |\chi(r)|^2\,dr$ to be convergent. These closed states, however, are not the only ones that are physically permissible, as we can also have states in which the particle arrives from infinity, is scattered by the central field of force, and goes off to infinity again. For these states the wave function may remain finite as $r \to \infty$. Such states will be dealt with in Chapter VIII under the heading of collision problems. In any case the wave function must not tend to infinity as $r \to \infty$, or it will represent a state that has no physical meaning.


39. Energy-levels of the hydrogen atom

The above analysis may be applied to the problem of the hydrogen atom with neglect of relativistic mechanics and the spin of the electron. The potential energy $V$ is now† $-e^2/r$, so that equation (69) becomes

$$\left\{\frac{d^2}{dr^2} - \frac{n(n + 1)}{r^2} + \frac{2me^2}{\hbar^2}\frac{1}{r}\right\}\chi_0 = -\frac{2mH'}{\hbar^2}\chi_0. \tag{73}$$

A thorough investigation of this equation has been given by Schrödinger.‡ We shall here obtain its eigenvalues $H'$ by an elementary argument.

It is convenient to put

$$\chi_0 = f(r)e^{-r/a}, \tag{74}$$

introducing the new function $f(r)$, where $a$ is one or other of the square roots

$$a = \pm\sqrt{-\hbar^2/2mH'}. \tag{75}$$

Equation (73) now becomes

$$\left\{\frac{d^2}{dr^2} - \frac{2}{a}\frac{d}{dr} - \frac{n(n + 1)}{r^2} + \frac{2me^2}{\hbar^2}\frac{1}{r}\right\}f(r) = 0. \tag{76}$$

We look for a solution of this equation in the form of a power series

$$f(r) = \sum_s c_sr^s, \tag{77}$$

in which consecutive values for $s$ differ by unity although these values themselves need not be integers. On substituting (77) in (76) we obtain

$$\sum_s c_s\{s(s - 1)r^{s-2} - (2s/a)r^{s-1} - n(n + 1)r^{s-2} + (2me^2/\hbar^2)r^{s-1}\} = 0,$$

which gives, on equating to zero the coefficient of $r^{s-2}$, the following relation between successive coefficients $c_s$,

$$c_s[s(s - 1) - n(n + 1)] = c_{s-1}[2(s - 1)/a - 2me^2/\hbar^2]. \tag{78}$$

We saw in the preceding section that only those eigenfunctions $\chi$ are allowed that tend to zero with $r$ and hence, from (74), $f(r)$ must tend to zero with $r$. The series (77) must therefore terminate on the side of small $s$ and the minimum value of $s$ must be greater than zero. Now the only possible minimum values of $s$ are those that make the coefficient of $c_s$ in (78) vanish, i.e. $n + 1$ and $-n$, and the second of these is negative or zero. Thus the minimum value of $s$ must be $n + 1$. Since $n$ is always an integer, the values of $s$ will all be integers.

† The $e$ here, denoting minus the charge on an electron, is, of course, to be distinguished from the $e$ denoting the base of exponentials.

‡ Schrödinger, Ann. d. Physik, 79 (1926), 361.

The series (77) will in general extend to infinity on the side of large $s$. For large values of $s$ the ratio of successive terms is

$$\frac{c_s}{c_{s-1}}r = \frac{2r}{sa}$$

according to (78). Thus the series (77) will always converge, as the ratios of the higher terms to one another are the same as for the series

$$\sum_s \frac{1}{s!}\left(\frac{2r}{a}\right)^s, \tag{79}$$

which converges to $e^{2r/a}$.

We must now examine how our solution $\chi_0$ behaves for large values of $r$. We must distinguish between the two cases of $H'$ positive and $H'$ negative. For $H'$ negative, $a$ given by (75) will be real. Suppose we take the positive value for $a$. Then as $r \to \infty$ the sum of the series (77) will tend to infinity according to the same law as the sum of the series (79), i.e. the law $e^{2r/a}$. Thus, from (74), $\chi_0$ will tend to infinity according to the law $e^{r/a}$ and will not represent a physically possible state. There is therefore in general no permissible solution of (73) for negative values of $H'$. An exception arises, however, whenever the series (77) terminates on the side of large $s$, in which case the boundary conditions are all satisfied. The condition for this termination of the series is that the coefficient of $c_{s-1}$ in (78) shall vanish for some value of the suffix $s - 1$ not less than its minimum value $n + 1$, which is the same as the condition that

$$\frac{s}{a} - \frac{me^2}{\hbar^2} = 0$$

for some integer $s$ not less than $n + 1$. With the help of (75) this condition becomes

$$H' = -\frac{me^4}{2s^2\hbar^2}, \tag{80}$$

and is thus a condition for the energy-level $H'$. Since $s$ may be any positive integer, the formula (80) gives a discrete set of negative energy-levels for the hydrogen atom. These are in agreement with experiment. For each of them (except the lowest one $s = 1$) there are several independent states, as there are various possible values for $n$, namely any positive or zero integer less than $s$. This multiplicity of states belonging to an energy-level is in addition to that mentioned in the preceding section arising from the various possible values for a component of angular momentum, which latter multiplicity occurs with any central field of force. The $n$ multiplicity occurs only with an inverse square law of force and even then is removed when one takes relativistic mechanics into account, as will be found in Chapter XI. The solution $\chi_0$ of (73) when $H'$ satisfies (80) tends to zero exponentially as $r \to \infty$ and thus represents a closed state (corresponding to an elliptic orbit in Bohr's theory).
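
The termination of the series can be followed through explicitly. The sketch below is a minimal illustration, assuming units in which $\hbar = m = e = 1$, so that the termination condition gives $a = s$ and formula (80) reads $H' = -1/2s^2$; the function names and the normalization of the lowest coefficient are choices made only for this example.

```python
from fractions import Fraction

def series_coefficients(n, s_max):
    """Coefficients c_s of (77) from the recursion (78), in units hbar = m = e = 1.

    The length a is set equal to s_max, the value fixed by the termination condition.
    """
    a = Fraction(s_max)
    coeffs = {n + 1: Fraction(1)}          # arbitrary normalisation of the lowest term
    for s in range(n + 2, s_max + 3):
        numerator = Fraction(2 * (s - 1)) / a - 2
        coeffs[s] = coeffs[s - 1] * numerator / (s * (s - 1) - n * (n + 1))
    return coeffs

def energy_level(s):
    """Formula (80) in the same units, i.e. in units of m e^4 / hbar^2."""
    return Fraction(-1, 2 * s * s)

def states_per_level(s):
    """Total of the 2n + 1 orientations over n = 0, 1, ..., s - 1 (spin neglected)."""
    return sum(2 * n + 1 for n in range(s))

coeffs = series_coefficients(n=0, s_max=2)
print(coeffs, energy_level(2), states_per_level(2))
```

For $n = 0$ and $s = 2$ the series reduces to $f(r) = r - \tfrac12r^2$, all higher coefficients vanishing, and the count of independent states for the level $s$ comes out as $s^2$.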

For any positive values of $H'$, $a$ given by (75) will be pure imaginary. The series (77), which is like the series (79) for large $r$, will now have a sum that remains finite as $r \to \infty$. Thus $\chi_0$ given by (74) will now remain finite as $r \to \infty$ and will therefore be a permissible solution of (73), giving a wave function $\psi$ that tends to zero according to the law $r^{-1}$ as $r \to \infty$. Hence in addition to the discrete set of negative energy-levels (80), all positive energy-levels are allowed. The states of positive energy are not closed, since for them the integral to infinity $\int^\infty |\chi_0|^2\,dr$ does not converge. (These states correspond to the hyperbolic orbits of Bohr's theory.)

40.  Selection  rules 

If  a dynamical  system  is  set  up  in  a certain  stationary  state,  it  will 
remain  in  that  stationary  state  so  long  as  it  is  not  acted  upon  by 
outside  forces.  Any  atomic  system  in  practice,  however,  frequently 
gets  acted  upon  by  external  electromagnetic  fields,  under  whose 
influence  it  is  liable  to  cease  to  be  in  one  stationary  state  and  to  make 
a transition  to  another.  The  theory  of  such  transitions  will  be  de- 
veloped in  §§  44  and  45.  A result  of  this  theory  is  that,  to  a high  degree 
of  accuracy,  transitions  between  two  states  cannot  occur  under  the 
influence  of  electromagnetic  radiation  if,  in  a Heisenberg  representa- 
tion with  these  two  stationary  states  as  two  of  the  basic  states,  the 
matrix  element,  referring  to  these  two  states,  of  the  representative 
of  the  total  electric  displacement  D of  the  system  vanishes.  Now  it 
happens  for  many  atomic  systems  that  the  great  majority  of  the 
matrix  elements  of  D in  a Heisenberg  representation  do  vanish,  and 
hence  there  are  severe  limitations  on  the  possibilities  for  transitions. 
The  rules  that  express  these  limitations  are  called  selection  rules. 

The  idea  of  selection  rules  can  be  refined  by  a more  detailed 
application  of  the  theory  of  §§44  and  45,  according  to  which 
the  matrix  elements  of  the  different  Cartesian  components  of  the 
vector  D are  associated  with  different  states  of  polarization  of  the 
electromagnetic radiation. The nature of this association is just what
one would get if one considered the matrix elements, or rather their
real parts, as the amplitudes of harmonic oscillators which interact
with  the  field  of  radiation  according  to  classical  electrodynamics. 

There is a general method for obtaining all selection rules, as
follows. Let us call the constants of the motion which are diagonal in
the Heisenberg representation $\alpha$'s and let $D$ be one of the Cartesian
components of $\mathbf{D}$. We must obtain an algebraic equation connecting
$D$ and the $\alpha$'s which does not involve any dynamical variables other
than $D$ and the $\alpha$'s and which is linear in $D$. Such an equation will
be of the form

$$\sum_r f_r D g_r = 0, \qquad (81)$$

where the $f_r$'s and $g_r$'s are functions of the $\alpha$'s only. If this
equation is expressed in terms of representatives, it gives us

$$\sum_r f_r(\alpha')\,\langle\alpha'|D|\alpha''\rangle\,g_r(\alpha'') = 0,$$

which shows that $\langle\alpha'|D|\alpha''\rangle = 0$ unless

$$\sum_r f_r(\alpha')\,g_r(\alpha'') = 0. \qquad (82)$$

This last equation, giving the connexion which must exist between
$\alpha'$ and $\alpha''$ in order that $\langle\alpha'|D|\alpha''\rangle$ may not
vanish, constitutes the selection rule, so far as the component $D$ of
$\mathbf{D}$ is concerned.
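
The passage from (81) to (82) rests on one step that may be written out.
Because the $\alpha$'s are diagonal in the representation, a function of them
standing to the left of $D$ may be replaced by its value for the bra, and a
function standing to the right by its value for the ket. A short worked
version, in the notation of the text:

```latex
\begin{aligned}
0 &= \langle\alpha'|\,\sum_r f_r(\alpha)\,D\,g_r(\alpha)\,|\alpha''\rangle \\
  &= \sum_r f_r(\alpha')\,\langle\alpha'|D|\alpha''\rangle\,g_r(\alpha'') \\
  &= \langle\alpha'|D|\alpha''\rangle\,\sum_r f_r(\alpha')\,g_r(\alpha'') .
\end{aligned}
```

A product of two numbers vanishes only if one of the factors does, so the
matrix element can differ from zero only when the sum over $r$ vanishes.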

Our work on the harmonic oscillator in § 34 provides an example
of a selection rule. Equation (8) is of the form (81) with $\bar\eta$ for $D$
and $H$ playing the part of the $\alpha$'s, and it shows that the matrix
elements $\langle H'|\bar\eta|H''\rangle$ of $\bar\eta$ all vanish except those
for which $H''-H' = \hbar\omega$. The conjugate complex of this result is that
the matrix elements $\langle H'|\eta|H''\rangle$ of $\eta$ all vanish except
those for which $H''-H' = -\hbar\omega$. Since $q$ is a numerical multiple of
$\eta-\bar\eta$, its matrix elements $\langle H'|q|H''\rangle$ all vanish
except those for which $H''-H' = \pm\hbar\omega$. If the harmonic oscillator
carries an electric charge, its electric displacement $D$ will be
proportional to $q$. The selection rule is then that only those transitions
can take place in which the energy $H$ changes by a single quantum
$\hbar\omega$.
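
The oscillator rule can be illustrated numerically. The following is a
minimal sketch, not part of the original argument. It assumes units in which
$\hbar = m = \omega = 1$, a basis of energy eigenstates cut off after a few
levels, and real phases for the matrix elements of $\eta$. The function names
are hypothetical.

```python
import math

def ladder_matrices(n_levels):
    """Return (eta, eta_bar) as nested lists in the energy eigenbasis.

    Assumed convention: eta raises the level index by one and
    eta_bar lowers it by one, so that eta_bar*eta - eta*eta_bar = 1.
    """
    eta = [[0.0] * n_levels for _ in range(n_levels)]
    eta_bar = [[0.0] * n_levels for _ in range(n_levels)]
    for n in range(n_levels - 1):
        eta[n + 1][n] = math.sqrt(n + 1)       # <n+1| eta |n>
        eta_bar[n][n + 1] = math.sqrt(n + 1)   # <n| eta_bar |n+1>
    return eta, eta_bar

def displacement_matrix(n_levels):
    """q as a numerical multiple of (eta - eta_bar), with hbar = m = omega = 1."""
    eta, eta_bar = ladder_matrices(n_levels)
    return [[-1j * (eta[i][j] - eta_bar[i][j]) / math.sqrt(2)
             for j in range(n_levels)] for i in range(n_levels)]

def allowed_level_changes(matrix, tol=1e-12):
    """Set of (row - column) differences at which a matrix element is non-zero."""
    n = len(matrix)
    return {i - j for i in range(n) for j in range(n) if abs(matrix[i][j]) > tol}

q = displacement_matrix(6)
print(allowed_level_changes(q))  # only changes of one level survive
```

Since neighbouring levels differ in energy by $\hbar\omega$, a change of the
level index by one is the same as a change of $H$ by a single quantum.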

We shall now obtain the selection rules for $m_z$ and $k$ for an electron
moving in a central field of force. The components of electric
displacement are here proportional to the Cartesian coordinates $x$, $y$, $z$.
Taking first $m_z$, we have that $m_z$ commutes with $z$, or that

$$m_z z - z m_z = 0.$$

This is an equation of the required type (81), giving us the selection
rule

$$m_z' - m_z'' = 0$$

for the $z$-component of the displacement. Again, from equations
(23) we have

$$[m_z,[m_z,x]] = [m_z,y] = -x$$

or

$$m_z^2 x - 2m_z x m_z + x m_z^2 - \hbar^2 x = 0,$$

which is also of the type (81) and gives us the selection rule

$$m_z'^2 - 2m_z' m_z'' + m_z''^2 - \hbar^2 = 0$$

or

$$(m_z' - m_z'' - \hbar)(m_z' - m_z'' + \hbar) = 0$$

for the $x$-component of the displacement. The selection rule for the
$y$-component is the same. Thus our selection rules for $m_z$ are that
in transitions associated with radiation with a polarization corresponding
to an electric dipole in the $z$-direction, $m_z$ cannot change, while in
transitions associated with a polarization corresponding to an electric
dipole in the $x$-direction or $y$-direction, $m_z$ must change by $\pm\hbar$.

We can determine more accurately the state of polarization of the
radiation associated with a transition in which $m_z$ changes by $\pm\hbar$, by
considering the condition for the non-vanishing of matrix elements
of $x+iy$ and $x-iy$. We have

$$[m_z, x+iy] = y - ix = -i(x+iy)$$

or

$$m_z(x+iy) - (x+iy)(m_z+\hbar) = 0,$$

which is again of the type (81). It gives

$$m_z' - m_z'' - \hbar = 0$$

as the condition that $\langle m_z'|x+iy|m_z''\rangle$ shall not vanish. Similarly,

$$m_z' - m_z'' + \hbar = 0$$

is the condition that $\langle m_z'|x-iy|m_z''\rangle$ shall not vanish. Hence

$$\langle m_z'|x-iy|m_z'-\hbar\rangle = 0$$

or

$$\langle m_z'|x|m_z'-\hbar\rangle = i\langle m_z'|y|m_z'-\hbar\rangle = (a+ib)e^{i\omega t}$$

say, $a$, $b$, and $\omega$ being real. The conjugate complex of this is

$$\langle m_z'-\hbar|x|m_z'\rangle = -i\langle m_z'-\hbar|y|m_z'\rangle = (a-ib)e^{-i\omega t}.$$

Thus the vector
$\tfrac12\{\langle m_z'|\mathbf{D}|m_z'-\hbar\rangle + \langle m_z'-\hbar|\mathbf{D}|m_z'\rangle\}$,
which determines the state of polarization of the radiation associated with
transitions for which $m_z'' = m_z'-\hbar$, has the following three components

$$\begin{aligned}
\tfrac12\{\langle m_z'|x|m_z'-\hbar\rangle + \langle m_z'-\hbar|x|m_z'\rangle\}
  &= \tfrac12\{(a+ib)e^{i\omega t} + (a-ib)e^{-i\omega t}\} = a\cos\omega t - b\sin\omega t, \\
\tfrac12\{\langle m_z'|y|m_z'-\hbar\rangle + \langle m_z'-\hbar|y|m_z'\rangle\}
  &= \tfrac12 i\{-(a+ib)e^{i\omega t} + (a-ib)e^{-i\omega t}\} = a\sin\omega t + b\cos\omega t, \\
\tfrac12\{\langle m_z'|z|m_z'-\hbar\rangle + \langle m_z'-\hbar|z|m_z'\rangle\} &= 0.
\end{aligned} \qquad (83)$$

From the form of these components we see that the associated radiation
moving in the $z$-direction will be circularly polarized, that
moving in any direction in the $xy$-plane will be linearly polarized in
this plane, and that moving in intermediate directions will be
elliptically polarized. The direction of circular polarization for
radiation moving in the $z$-direction will depend on whether $\omega$ is
positive or negative, and this will depend on which of the two states $m_z'$
or $m_z'' = m_z'-\hbar$ has the greater energy.
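
The statement about circular polarization can be checked directly from the
components (83). The sketch below is illustrative only. The function names
and the finite-difference test of the sense of rotation are assumptions made
for this example.

```python
import math

def dipole_components(a, b, omega, t):
    """The three components (83) for a transition with m_z'' = m_z' - hbar."""
    x = a * math.cos(omega * t) - b * math.sin(omega * t)
    y = a * math.sin(omega * t) + b * math.cos(omega * t)
    z = 0.0
    return x, y, z

def rotation_sense(a, b, omega, t=0.0, dt=1e-6):
    """+1 if the vector turns anticlockwise seen from +z, -1 if clockwise.

    Uses the sign of x*dy - y*dx over a small time step dt.
    """
    x0, y0, _ = dipole_components(a, b, omega, t)
    x1, y1, _ = dipole_components(a, b, omega, t + dt)
    cross = x0 * (y1 - y0) - y0 * (x1 - x0)
    return 1 if cross > 0 else -1

# The length of the vector stays equal to sqrt(a^2 + b^2) at all times,
# so its tip describes a circle in the xy-plane.
for n in range(5):
    x, y, _ = dipole_components(1.0, 2.0, 3.0, 0.25 * n)
    print(round(x * x + y * y, 12))
```

Reversing the sign of `omega` reverses the value returned by
`rotation_sense`, in line with the remark that the direction of circular
polarization depends on which state has the greater energy.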

We shall now determine the selection rule for $k$. We have

$$\begin{aligned}
[k(k+\hbar), z] &= [m_x^2, z] + [m_y^2, z] \\
  &= -y m_x - m_x y + x m_y + m_y x \\
  &= 2(m_y x - m_x y + i\hbar z) \\
  &= 2(m_y x - y m_x) = 2(x m_y - m_x y).
\end{aligned}$$

Similarly,

$$[k(k+\hbar), x] = 2(y m_z - m_y z)$$

and

$$[k(k+\hbar), y] = 2(m_x z - x m_z).$$

Hence

$$\begin{aligned}
[k(k+\hbar), [k(k+\hbar), z]]
  &= 2[k(k+\hbar),\, m_y x - m_x y + i\hbar z] \\
  &= 2m_y[k(k+\hbar), x] - 2m_x[k(k+\hbar), y] + 2i\hbar[k(k+\hbar), z] \\
  &= 4m_y(y m_z - m_y z) - 4m_x(m_x z - x m_z) + 2\{k(k+\hbar)z - z k(k+\hbar)\} \\
  &= 4(m_x x + m_y y + m_z z)m_z - 4(m_x^2 + m_y^2 + m_z^2)z + 2\{k(k+\hbar)z - z k(k+\hbar)\}.
\end{aligned}$$

From (22)

$$m_x x + m_y y + m_z z = 0 \qquad (84)$$

and hence

$$[k(k+\hbar), [k(k+\hbar), z]] = -2\{k(k+\hbar)z + z k(k+\hbar)\},$$

which gives

$$k^2(k+\hbar)^2 z - 2k(k+\hbar)\,z\,k(k+\hbar) + z k^2(k+\hbar)^2 - 2\hbar^2\{k(k+\hbar)z + z k(k+\hbar)\} = 0. \qquad (85)$$

Similar equations hold for $x$ and $y$. These equations are of the
required type (81), and give us the selection rule

$$k'^2(k'+\hbar)^2 - 2k'(k'+\hbar)k''(k''+\hbar) + k''^2(k''+\hbar)^2 - 2\hbar^2 k'(k'+\hbar) - 2\hbar^2 k''(k''+\hbar) = 0,$$

which reduces to

$$(k'+k''+2\hbar)(k'+k'')(k'-k''+\hbar)(k'-k''-\hbar) = 0.$$

A transition can take place between two states $k'$ and $k''$ only if one
of these four factors vanishes.
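
The reduction of the quartic condition to a product of four factors is easy
to confirm by direct evaluation. The following sketch uses integer values of
$k$ in units with $\hbar = 1$. The function names are hypothetical and the
check is a numerical illustration, not a proof.

```python
from itertools import product

def quartic(k1, k2, hbar=1):
    """Left-hand side of the selection rule obtained from (85)."""
    a = k1 * (k1 + hbar)
    b = k2 * (k2 + hbar)
    return a * a - 2 * a * b + b * b - 2 * hbar**2 * a - 2 * hbar**2 * b

def factored(k1, k2, hbar=1):
    """The same quantity written as a product of four factors."""
    return ((k1 + k2 + 2 * hbar) * (k1 + k2)
            * (k1 - k2 + hbar) * (k1 - k2 - hbar))

def allowed_pairs(k_max, hbar=1):
    """Pairs (k', k'') with 0 <= k <= k_max for which the quartic vanishes."""
    ks = range(0, k_max + 1)
    return [(k1, k2) for k1, k2 in product(ks, ks) if quartic(k1, k2, hbar) == 0]

# The two forms agree for every pair tried.
print(all(quartic(a, b) == factored(a, b) for a in range(8) for b in range(8)))
print(allowed_pairs(3))
```

Apart from the pair $k' = k'' = 0$, which comes from the factor $(k'+k'')$,
every pair listed has $k'$ and $k''$ differing by one unit.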

Now the first of the factors, $(k'+k''+2\hbar)$, can never vanish since
the eigenvalues of $k$ are all positive or zero. The second, $(k'+k'')$, can
vanish only if $k' = 0$ and $k'' = 0$. But transitions between two states
with these values for $k$ cannot occur on account of other selection
rules, as may be seen from the following argument. If two states
(labelled respectively with a single prime and a double prime) are
such that $k' = 0$ and $k'' = 0$, then from (41) and the corresponding
results for $m_x$ and $m_y$, $m_x' = m_y' = m_z' = 0$ and
$m_x'' = m_y'' = m_z'' = 0$. The selection rule for $m_z$ now shows that the
matrix elements of $x$ and $y$ referring to the two states must vanish, as the
value of $m_z$ does not change during the transition, and the similar
selection rule for $m_x$ or $m_y$ shows that the matrix element of $z$ also
vanishes. Thus transitions between the two states cannot occur. Our selection
rule for $k$ now reduces to

$$(k'-k''+\hbar)(k'-k''-\hbar) = 0,$$

showing that $k$ must change by $\pm\hbar$. This selection rule may be written

$$k'^2 - 2k'k'' + k''^2 - \hbar^2 = 0,$$

and since this is the condition that a matrix element
$\langle k'|z|k''\rangle$ shall not vanish, we get the equation

$$k^2 z - 2kzk + zk^2 - \hbar^2 z = 0$$

or

$$[k,[k,z]] = -z, \qquad (86)$$

a result which could not easily be obtained in a more direct way.

As a final example we shall obtain the selection rule for the magnitude
$K$ of the total angular momentum $\mathbf{M}$ of a general atomic system.
Let $x$, $y$, $z$ be the coordinates of one of the electrons. We must obtain
the condition that the $(K',K'')$ matrix element of $x$, $y$, or $z$ shall not
vanish. This is evidently the same as the condition that the $(K',K'')$
matrix element of $\lambda_1$, $\lambda_2$, or $\lambda_3$ shall not vanish,
where $\lambda_1$, $\lambda_2$, and $\lambda_3$ are any three independent linear
functions of $x$, $y$, and $z$ with numerical coefficients, or more generally
with any coefficients that commute with $K$ and are thus represented by
matrices which are diagonal with respect to $K$. Let

$$\begin{aligned}
\lambda_0 &= M_x x + M_y y + M_z z, \\
\lambda_x &= M_y z - M_z y - i\hbar x, \\
\lambda_y &= M_z x - M_x z - i\hbar y, \\
\lambda_z &= M_x y - M_y x - i\hbar z.
\end{aligned}$$

We have

$$\begin{aligned}
M_x\lambda_x + M_y\lambda_y + M_z\lambda_z
  &= \sum_{xyz} (M_x M_y z - M_x M_z y - i\hbar M_x x) \\
  &= \sum_{xyz} (M_x M_y - M_y M_x - i\hbar M_z)z = 0
\end{aligned} \qquad (87)$$

from (29), the sums being taken over cyclic permutations of $x$, $y$, $z$.
Thus $\lambda_x$, $\lambda_y$, and $\lambda_z$ are not linearly independent
functions of $x$, $y$, and $z$. Any two of them, however, together with
$\lambda_0$ are three linearly independent functions of $x$, $y$, and $z$ and
may be taken as the above $\lambda_1$, $\lambda_2$, $\lambda_3$, since the
coefficients $M_x$, $M_y$, $M_z$ all commute with $K$.

Our problem thus reduces to finding the condition that the $(K',K'')$
matrix elements of $\lambda_0$, $\lambda_x$, $\lambda_y$, and $\lambda_z$ shall
not vanish. The physical meanings of these $\lambda$'s are that $\lambda_0$ is
proportional to the component of the vector $(x, y, z)$ in the direction of
the vector $\mathbf{M}$, and $\lambda_x$, $\lambda_y$, $\lambda_z$ are
proportional to the Cartesian components of the component of $(x, y, z)$
perpendicular to $\mathbf{M}$.

Since $\lambda_0$ is a scalar it must commute with $K$. It follows that only
the diagonal elements $\langle K'|\lambda_0|K'\rangle$ of $\lambda_0$ can differ
from zero, so the selection rule is that $K$ cannot change so far as
$\lambda_0$ is concerned. Applying (30) to the vector
$\lambda_x, \lambda_y, \lambda_z$, we have

$$[M_z,\lambda_x] = \lambda_y, \qquad [M_z,\lambda_y] = -\lambda_x, \qquad [M_z,\lambda_z] = 0.$$

These relations between $M_z$ and $\lambda_x$, $\lambda_y$, $\lambda_z$ are of
exactly the same form as the relations (23), (24) between $m_z$ and
$x$, $y$, $z$, and also (87) is of the same form as (84). The dynamical
variables $\lambda_x$, $\lambda_y$, $\lambda_z$ thus have the same properties
relative to the angular momentum $\mathbf{M}$ as $x$, $y$, $z$ have relative to
$\mathbf{m}$. The deduction of the selection rule for $k$ when the electric
displacement is proportional to $(x, y, z)$ can therefore be taken over and
applied to the selection rule for $K$ when the electric displacement is
proportional to $(\lambda_x, \lambda_y, \lambda_z)$. We find in this way that,
so far as $\lambda_x$, $\lambda_y$, $\lambda_z$ are concerned, the selection
rule for $K$ is that it must change by $\pm\hbar$.

Collecting results, we have as the selection rule for $K$ that it must
change by $0$ or $\pm\hbar$. We have considered the electric displacement
produced by only one of the electrons, but the same selection rule
must hold for each electron and thus also for the total electric
displacement.


41.  The  Zeeman  effect  for  the  hydrogen  atom 

We  shall  now  consider  the  system  of  a hydrogen  atom  in  a uniform 
magnetic field. The Hamiltonian (57) with $V = -e^2/r$, which describes
the hydrogen atom in no external field, gets modified by the magnetic
field, the modification, according to classical mechanics, consisting
in the replacement of the components of momentum, $p_x$, $p_y$, $p_z$, by
$p_x + (e/c)A_x$, $p_y + (e/c)A_y$, $p_z + (e/c)A_z$, where $A_x$, $A_y$, $A_z$
are the components of the vector potential describing the field. For a
uniform field of magnitude $\mathcal{H}$ in the direction of the $z$-axis we
may take $A_x = -\tfrac12\mathcal{H}y$, $A_y = \tfrac12\mathcal{H}x$, $A_z = 0$.
The classical Hamiltonian will then be

$$H = \frac{1}{2m}\left\{\left(p_x - \frac{1}{2}\frac{e}{c}\mathcal{H}y\right)^2 + \left(p_y + \frac{1}{2}\frac{e}{c}\mathcal{H}x\right)^2 + p_z^2\right\} - \frac{e^2}{r}.$$



This  classical  Hamiltonian  may  be  taken  over  into  the  quantum 
theory  if  we  add  on  to  it  a term  giving  the  effect  of  the  spin  of  the 
electron.  According  to  experimental  evidence  and  according  to  the 
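theory of Chapter XI, the electron has a magnetic moment
$-e\hbar/2mc \cdot \boldsymbol{\sigma}$, where $\boldsymbol{\sigma}$ is the spin
vector of § 37. The energy of this magnetic moment in the magnetic field will
be $e\hbar\mathcal{H}/2mc \cdot \sigma_z$. Thus the total quantum
Hamiltonian will be

$$H = \frac{1}{2m}\left\{\left(p_x - \frac{1}{2}\frac{e}{c}\mathcal{H}y\right)^2 + \left(p_y + \frac{1}{2}\frac{e}{c}\mathcal{H}x\right)^2 + p_z^2\right\} - \frac{e^2}{r} + \frac{e\hbar\mathcal{H}}{2mc}\sigma_z. \qquad (88)$$
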


There  ought  strictly  to  be  other  terms  in  this  Hamiltonian  giving  the 
interaction  of  the  magnetic  moment  of  the  electron  with  the  electric 
field  of  the  nucleus  of  the  atom,  but  this  effect  is  small,  of  the  same 
order  of  magnitude  as  the  correction  one  gets  by  taking  relativistic 
mechanics  into  account,  and  will  be  neglected  here.  It  will  be  taken 
into  account  in  the  relativistic  theory  of  the  electron  given  in 
Chapter  XI. 

If  the  magnetic  field  is  not  too  large,  we  can  neglect  terms  involving 
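$\mathcal{H}^2$, so that the Hamiltonian (88) reduces to

$$\begin{aligned}
H &= \frac{1}{2m}(p_x^2 + p_y^2 + p_z^2) - \frac{e^2}{r} + \frac{e\mathcal{H}}{2mc}(x p_y - y p_x) + \frac{e\hbar\mathcal{H}}{2mc}\sigma_z \\
  &= \frac{1}{2m}(p_x^2 + p_y^2 + p_z^2) - \frac{e^2}{r} + \frac{e\mathcal{H}}{2mc}(m_z + \hbar\sigma_z).
\end{aligned} \qquad (89)$$
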
The extra terms due to the magnetic field are now
$e\mathcal{H}/2mc \cdot (m_z + \hbar\sigma_z)$.
But these extra terms commute with the total Hamiltonian and are
thus constants of the motion. This makes the problem very easy.
The stationary states of the system, i.e. the eigenstates of the
Hamiltonian (89), will be those eigenstates of the Hamiltonian for no field
that are simultaneously eigenstates of the observables $m_z$ and $\sigma_z$, or
at least of the one observable $m_z + \hbar\sigma_z$, and the energy-levels of
the system will be those for the system with no field, given by (80) if
one considers only closed states, increased by an eigenvalue of
$e\mathcal{H}/2mc \cdot (m_z + \hbar\sigma_z)$. Thus stationary states of the
system with no field for which $m_z$ has the numerical value $m_z'$, an
integral multiple of $\hbar$, and for which also $\sigma_z$ has the numerical
value $\sigma_z' = \pm 1$, will still be stationary states when the field is
applied. Their energy will be increased by an amount consisting of the sum of
two parts, a part $e\mathcal{H}/2mc \cdot m_z'$ arising from the orbital
motion, which part may be considered as due to an orbital magnetic moment
$-em_z'/2mc$, and a part $e\mathcal{H}/2mc \cdot \hbar\sigma_z'$ arising from
the spin. The ratio of the orbital magnetic moment to the orbital angular
momentum $m_z'$ is $-e/2mc$, which is half the ratio of the spin magnetic
moment to the spin angular momentum. This fact is sometimes referred to as the
magnetic anomaly of the spin.

Since the energy-levels now involve $m_z$, the selection rule for $m_z$
obtained in the preceding section becomes capable of direct comparison
with experiment. We take a Heisenberg representation in
which, among other constants of the motion, $m_z$ and $\sigma_z$ are diagonal.
The selection rule for $m_z$ now requires $m_z$ to change by $\hbar$, $0$, or
$-\hbar$, while $\sigma_z$, since it commutes with the electric displacement,
will not change at all. Thus the energy difference between the two states
taking part in the transition process will differ by an amount
$e\hbar\mathcal{H}/2mc$, $0$, or $-e\hbar\mathcal{H}/2mc$ from its value for no
magnetic field. Hence, from Bohr's frequency condition, the frequency of the
associated electromagnetic radiation will differ by $e\mathcal{H}/4\pi mc$,
$0$, or $-e\mathcal{H}/4\pi mc$ from that for no magnetic field. This means
that each spectral line for no magnetic field gets split up by the field into
three components. If one considers radiation moving in the $z$-direction,
then from (83) the two outer components will be circularly polarized,
while the central undisplaced one will be of zero intensity. These
results are in agreement with experiment and also with the classical
theory of the Zeeman effect.
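
The counting behind the three components can be made explicit with a short
enumeration. This is a sketch under stated assumptions. Energies are
measured in units of $e\hbar\mathcal{H}/2mc$, $m_z$ in units of $\hbar$, the
lists of $m_z$ values for the upper and lower levels are chosen by hand, and
the function names are hypothetical.

```python
def level_shift(m, sigma):
    """Energy shift of a state in units of e*hbar*H/(2*m*c).

    m is the value of m_z in units of hbar, sigma is +1 or -1.
    """
    return m + sigma

def line_shifts(upper_m_values, lower_m_values):
    """Sorted list of distinct shifts of a spectral line.

    Applies the selection rules of the text: m_z changes by +1, 0 or -1
    (in units of hbar) and sigma_z does not change.
    """
    shifts = set()
    for sigma in (+1, -1):
        for m_up in upper_m_values:
            for m_low in lower_m_values:
                if m_up - m_low in (-1, 0, 1):
                    shifts.add(level_shift(m_up, sigma) - level_shift(m_low, sigma))
    return sorted(shifts)

# Upper level with m_z = -1, 0, 1 and lower level with m_z = 0.
print(line_shifts([-1, 0, 1], [0]))  # [-1, 0, 1]
```

The spin contribution cancels in every allowed transition because
$\sigma_z$ is unchanged, which is why only three components appear however
many values of $m_z$ the two levels possess.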


VII 

PERTURBATION  THEORY 

42.  General  remarks 

In  the  preceding  chapter  exact  treatments  were  given  of  some  simple 
dynamical  systems  in  the  quantum  theory.  Most  quantum  problems, 
however,  cannot  be  solved  exactly  with  the  present  resources  of 
mathematics,  as  they  lead  to  equations  whose  solutions  cannot  be 
expressed  in  finite  terms  with  the  help  of  the  ordinary  functions  of 
analysis.  For  such  problems  one  can  often  use  a perturbation  method. 
This  consists  in  splitting  up  the  Hamiltonian  into  two  parts,  one  of 
which  must  be  simple  and  the  other  small.  The  first  part  may  then 
be  considered  as  the  Hamiltonian  of  a simplified  or  unperturbed 
system,  which  can  be  dealt  with  exactly,  and  the  addition  of  the 
second  will  then  require  small  corrections,  of  the  nature  of  a perturba- 
tion, in  the  solution  for  the  unperturbed  system.  The  requirement 
that  the  first  part  shall  be  simple  requires  in  practice  that  it  shall  not 
involve the time explicitly. If the second part contains a small
numerical factor $\epsilon$, we can obtain the solution of our equations for
the perturbed system in the form of a power series in $\epsilon$, which,
provided it converges, will give the answer to our problem with any
desired  accuracy.  Even  when  the  series  does  not  converge,  the  first 
approximation  obtained  by  means  of  it  is  usually  fairly  accurate. 

There  are  two  distinct  methods  in  perturbation  theory.  In  one  of 
these  the  perturbation  is  considered  as  causing  a modification  of  the 
states  of  motion  of  the  unperturbed  system.  In  the  other  we  do  not 
consider  any  modification  to  be  made  in  the  states  of  the  unperturbed 
system, but we suppose that the perturbed system, instead of remaining
permanently in one of these states, is continually changing from
one to another, or making transitions, under the influence of the
perturbation.  Which  method  is  to  be  used  in  any  particular  case 
depends  on  the  nature  of  the  problem  to  be  solved.  The  first  method 
is  useful  usually  only  when  the  perturbing  energy  (the  correction  in  the 
Hamiltonian  for  the  undisturbed  system)  does  not  involve  the  time 
explicitly,  and  is  then  applied  to  the  stationary  states.  It  can  be  used 
for  calculating  things  that  do  not  refer  to  any  definite  time,  such  as 
the energy-levels of the stationary states of the perturbed system, or,
in the case of collision problems, the probability of scattering through
a given angle. The second method must, on the other hand, be used
for  solving  all  problems  involving  a consideration  of  time,  such  as 
those  about  the  transient  phenomena  that  occur  when  the  perturba- 
tion is  suddenly  applied,  or  more  generally  problems  in  which  the 
perturbation  varies  with  the  time  in  any  way  (i.e.  in  which  the  per- 
turbing energy  involves  the  time  explicitly).  Again,  this  second 
method  must  be  used  in  collision  problems,  even  though  the  per- 
turbing energy  does  not  here  involve  the  time  explicitly,  if  one 
wishes  to  calculate  absorption  and  emission  probabilities,  since  these 
probabilities,  unlike  a scattering  probability,  cannot  be  defined  with- 
out reference  to  a state  of  affairs  that  varies  with  the  time. 

One  can  summarize  the  distinctive  features  of  the  two  methods  by 
saying  that,  with  the  first  method,  one  compares  the  stationary  states 
of  the  perturbed  system  with  those  of  the  unperturbed  system;  with 
the  second  method  one  takes  a stationary  state  of  the  unperturbed 
system and sees how it varies with time under the influence of the
perturbation. 

43. The change in the energy-levels caused by a perturbation

The first of the above-mentioned methods will now be applied to
the calculation of the changes in the energy-levels of a system caused
by a perturbation. We assume the perturbing energy, like the
Hamiltonian for the unperturbed system, not to involve the time explicitly.
Our problem has a meaning, of course, only provided the energy-levels
of the unperturbed system are discrete and the differences between
them are large compared with the changes in them caused by the
perturbation. This circumstance results in the treatment of perturbation
problems by the first method having some different features
according to whether the energy-levels of the unperturbed system are
discrete or continuous.

Let  the  Hamiltonian  of  the  perturbed  system  be 

H = E+V,  (1) 

E being  the  Hamiltonian  of  the  unperturbed  system  and  V the  small 
perturbing  energy.  By  hypothesis  each  eigenvalue  H'  of  H lies  very 
close  to  one  and  only  one  eigenvalue  E'  of  E.  We  shall  use  the  same 
number  of  primes  to  specify  any  eigenvalue  of  H and  the  eigenvalue 
of  E to  which  it  lies  very  close.  Thus  we  shall  have  H"  differing  from 
E"  by  a small  quantity  of  order  V and  differing  from  E'  by  a quantity 
that  is  not  small  unless  E'  = E".  We  must  now  take  care  always  to 
use different numbers of primes to specify eigenvalues of H and E
which  we  do  not  want  to  lie  very  close  together. 

To obtain the eigenvalues of $H$, we have to solve the equation

$$H|H'\rangle = H'|H'\rangle$$

or

$$(H'-E)|H'\rangle = V|H'\rangle. \qquad (2)$$

Let $|0\rangle$ be an eigenket of $E$ belonging to the eigenvalue $E'$ and
suppose the $|H'\rangle$ and $H'$ that satisfy (2) to differ from $|0\rangle$
and $E'$ only by small quantities and to be expressed as

$$\begin{aligned}
|H'\rangle &= |0\rangle + |1\rangle + |2\rangle + \dots, \\
H' &= E' + a_1 + a_2 + \dots,
\end{aligned} \qquad (3)$$

where $|1\rangle$ and $a_1$ are of the first order of smallness (i.e. the same
order as $V$), $|2\rangle$ and $a_2$ are of the second order, and so on.
Substituting these expressions in (2), we obtain

$$\{E'-E+a_1+a_2+\dots\}\{|0\rangle+|1\rangle+|2\rangle+\dots\} = V\{|0\rangle+|1\rangle+\dots\}.$$

If we now separate the terms of zero order, of the first order, of the
second order, and so on, we get the following set of equations,

$$\begin{aligned}
(E'-E)|0\rangle &= 0, \\
(E'-E)|1\rangle + a_1|0\rangle &= V|0\rangle, \\
(E'-E)|2\rangle + a_1|1\rangle + a_2|0\rangle &= V|1\rangle, \\
&\;\;\vdots
\end{aligned} \qquad (4)$$

The first of these equations tells us, what we have already assumed,
that $|0\rangle$ is an eigenket of $E$ belonging to the eigenvalue $E'$. The
others enable us to calculate the various corrections $|1\rangle$,
$|2\rangle$, ..., $a_1$, $a_2$, ... .

For  the  further  discussion  of  these  equations  it  is  convenient  to 
introduce  a representation  in  which  E is  diagonal,  i.e.  a Heisenberg 
representation  for  the  unperturbed  system,  and  to  take  E itself  as 
one  of  the  observables  whose  eigenvalues  label  the  representatives. 
Let the others, in the event of others being necessary, as is the case
when there is more than one eigenstate of $E$ belonging to any
eigenvalue, be called $\beta$'s. A basic bra is then $\langle E''\beta''|$.
Since $|0\rangle$ is an eigenket of $E$ belonging to the eigenvalue $E'$, we
have

$$\langle E''\beta''|0\rangle = \delta_{E''E'}\,f(\beta''), \qquad (5)$$

where $f(\beta'')$ is some function of the variables $\beta''$. With the help
of this result the second of equations (4), written in terms of
representatives, becomes

$$(E'-E'')\langle E''\beta''|1\rangle + a_1\,\delta_{E''E'}\,f(\beta'') = \sum_{\beta'} \langle E''\beta''|V|E'\beta'\rangle f(\beta'). \qquad (6)$$

Putting $E'' = E'$ here, we get

$$a_1 f(\beta'') = \sum_{\beta'} \langle E'\beta''|V|E'\beta'\rangle f(\beta'). \qquad (7)$$

Equation (7) is of the form of the standard equation in the theory
of eigenvalues, so far as the variables $\beta'$ are concerned. It shows that
the various possible values for $a_1$ are the eigenvalues of the matrix
$\langle E'\beta''|V|E'\beta'\rangle$. This matrix is a part of the
representative of the perturbing energy in the Heisenberg representation for
the unperturbed system, namely, the part consisting of those elements that
refer to the same unperturbed energy-level $E'$ for their row and
column. Each of these values for $a_1$ gives, to the first order, an
energy-level of the perturbed system lying close to the energy-level $E'$ of
the unperturbed system.† There may thus be several energy-levels of the
perturbed system lying close to the one energy-level $E'$ of the
unperturbed system, their number being anything not exceeding the
number  of  independent  states  of  the  unperturbed  system  belonging 
to  the  energy-level  E'.  In  this  way  the  perturbation  may  cause  a 
separation  or  partial  separation  of  the  energy-levels  that  coincide 
at  E'  for  the  unperturbed  system. 
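
A small numerical illustration of equation (7) may help. The sketch below
treats the simplest degenerate case, a level $E'$ to which two independent
states belong, and assumes that the part of $V$ referring to this level is a
real symmetric 2 x 2 matrix. The function name and the sample numbers are
hypothetical.

```python
import math

def first_order_splitting(v11, v22, v12):
    """Solve equation (7) for a doubly degenerate level.

    The block of V inside the level is [[v11, v12], [v12, v22]].
    Returns a list of (a1, f) pairs, where a1 is a first-order energy
    change and f = (f1, f2) is the corresponding normalized solution.
    If v12 = 0 and v11 = v22 the level does not split and any f would do.
    """
    mean = 0.5 * (v11 + v22)
    half_gap = math.hypot(0.5 * (v11 - v22), v12)
    results = []
    for a1 in (mean - half_gap, mean + half_gap):
        if abs(v12) > 1e-15:
            # From the first row of (7): (v11 - a1) f1 + v12 f2 = 0.
            f = (v12, a1 - v11)
        else:
            # Block already diagonal: pick the basis state with this a1.
            f = (1.0, 0.0) if abs(a1 - v11) <= abs(a1 - v22) else (0.0, 1.0)
        norm = math.hypot(f[0], f[1])
        results.append((a1, (f[0] / norm, f[1] / norm)))
    return results

for a1, f in first_order_splitting(0.0, 0.0, 0.1):
    print(round(a1, 6), f)
```

With vanishing diagonal elements and an off-diagonal element of 0.1, the
level separates into two, displaced by -0.1 and +0.1, and the corresponding
solutions are the antisymmetric and symmetric combinations of the two
original states.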

† To distinguish these energy-levels one from another we should require some
more elaborate notation, since according to the present notation they must all be
specified by the same number of primes, namely by the number of primes specifying
the energy-level of the unperturbed system from which they arise. For our present
purposes, however, this more elaborate notation is not required.

Equation (7) also determines, to the zero order, the representatives
$\langle E''\beta''|0\rangle$ of the stationary states of the perturbed system
belonging to energy-levels lying close to $E'$, any solution $f(\beta')$ of (7)
substituted in (5) giving one such representative. Each of these stationary
states of the perturbed system approximates to one of the stationary states
of the unperturbed system, but the converse, that each stationary
state of the unperturbed system approximates to one of the stationary
states of the perturbed system, is not true, since the general
stationary state of the unperturbed system belonging to the
energy-level $E'$ is represented by the right-hand side of (5) with an
arbitrary function $f(\beta'')$. The problem of finding which stationary
states of the unperturbed system approximate to stationary states of the
perturbed system, i.e. the problem of finding the solutions $f(\beta')$ of
(7), corresponds to the problem of 'secular perturbations' in classical
mechanics. It should be noted that the above results are independent
of the values of all those matrix elements of the perturbing
energy which refer to two different energy-levels of the unperturbed
system.

Let us see what the above results become in the specially simple case
when there is only one stationary state of the unperturbed system
belonging to each energy-level.† In this case $E$ alone fixes the
representation, no $\beta$'s being required. The sum in (7) now reduces to a
single term and we get

$$a_1 = \langle E'|V|E'\rangle. \qquad (8)$$

There is only one energy-level of the perturbed system lying close to
any energy-level of the unperturbed system and the change in energy
is equal, in the first order, to the corresponding diagonal element of the
perturbing energy in the Heisenberg representation for the unperturbed
system, or to the average value of the perturbing energy for the
corresponding unperturbed state. The latter formulation of the result is the
same as in classical mechanics when the unperturbed system is multiply
periodic.

We shall proceed to calculate the second-order correction $a_2$ in
the energy-level for the case when the unperturbed system is
non-degenerate. Equation (5) for this case reads

$$\langle E''|0\rangle = \delta_{E''E'}$$

with neglect of an unimportant numerical factor, and equation (6)
reads

$$(E'-E'')\langle E''|1\rangle + a_1\delta_{E''E'} = \langle E''|V|E'\rangle.$$

This gives us the value of $\langle E''|1\rangle$ when $E'' \neq E'$, namely

$$\langle E''|1\rangle = \frac{\langle E''|V|E'\rangle}{E'-E''}. \qquad (9)$$

The third of equations (4), written in terms of representatives,
becomes

$$(E'-E'')\langle E''|2\rangle + a_1\langle E''|1\rangle + a_2\delta_{E''E'} = \sum_{E'''} \langle E''|V|E'''\rangle\langle E'''|1\rangle.$$

Putting $E'' = E'$ here, we get

$$a_1\langle E'|1\rangle + a_2 = \sum_{E'''} \langle E'|V|E'''\rangle\langle E'''|1\rangle,$$

which reduces, with the help of (8), to

$$a_2 = \sum_{E''\neq E'} \langle E'|V|E''\rangle\langle E''|1\rangle.$$


† A system with only one stationary state belonging to each energy-level is often
called  non-degenerate  and  one  with  two  or  more  stationary  states  belonging  to  an 
energy-level  is  called  degenerate,  although  these  words  are  not  very  appropriate  from 
the  modern  point  of  view. 
Substituting for $\langle E''|1\rangle$ from (9), we obtain finally

$$a_2 = \sum_{E''\neq E'} \frac{\langle E'|V|E''\rangle\langle E''|V|E'\rangle}{E'-E''},$$

giving for the total energy change to the second order

$$a_1 + a_2 = \langle E'|V|E'\rangle + \sum_{E''\neq E'} \frac{\langle E'|V|E''\rangle\langle E''|V|E'\rangle}{E'-E''}. \qquad (10)$$


The  method  may  be  developed  for  the  calculation  of  the  higher 
approximations  if  required.  General  recurrence  formulas  giving  the 
nth  order  corrections  in  terms  of  those  of  lower  order  have  been 
obtained by Born, Heisenberg, and Jordan.†
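
Formula (10) can be compared with a case that is soluble exactly. The
following sketch assumes a system with only two non-degenerate levels and a
real symmetric perturbation, for which the eigenvalues of $H = E + V$ are
known in closed form. The function names and sample numbers are
hypothetical.

```python
import math

def exact_levels(e1, e2, v11, v22, v12):
    """Exact eigenvalues (lower, upper) of the 2x2 Hamiltonian H = E + V."""
    h11, h22 = e1 + v11, e2 + v22
    mean = 0.5 * (h11 + h22)
    half_gap = math.hypot(0.5 * (h11 - h22), v12)
    return mean - half_gap, mean + half_gap

def second_order_level(e_this, e_other, v_diag, v_off):
    """E' + a1 + a2 from formula (10) when there is one other level."""
    a1 = v_diag                               # diagonal element, equation (8)
    a2 = v_off * v_off / (e_this - e_other)   # single term of the sum in (10)
    return e_this + a1 + a2

e1, e2 = 0.0, 1.0                  # unperturbed levels, with e1 < e2
v11, v22, v12 = 0.01, -0.02, 0.03  # small perturbation
lower, upper = exact_levels(e1, e2, v11, v22, v12)
print(lower, second_order_level(e1, e2, v11, v12))
print(upper, second_order_level(e2, e1, v22, v12))
```

For these numbers the two results agree to within a few parts in $10^5$, the
remaining difference being of the third order in $V$. The agreement
deteriorates, as the text leads one to expect, when $V$ ceases to be small
compared with $E'-E''$.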


44.  The  perturbation  considered  as  causing  transitions 

We shall now consider the second of the two perturbation methods
mentioned in § 42. We suppose again that we have an unperturbed
system governed by a Hamiltonian $E$ which does not involve the
time explicitly, and a perturbing energy $V$ which can now be an
arbitrary function of the time. The Hamiltonian for the perturbed
system is again $H = E+V$. For the present method it does not
make any essential difference whether the energy-levels of the
unperturbed system, i.e. the eigenvalues of $E$, form a discrete or
continuous set. We shall, however, take the discrete case, for
definiteness. We shall again work with a Heisenberg representation
for the unperturbed system, but as there will now be no advantage in
taking $E$ itself as one of the observables whose eigenvalues label the
representatives, we shall suppose we have a general set of $\alpha$'s to label
the representatives.

Let us suppose that at the initial time t_0 the system is in a state for
which the α's certainly have the values α'. The ket corresponding to
this state is the basic ket |α'⟩. If there were no perturbation, i.e. if the
Hamiltonian were E, this state would be stationary. The perturbation
causes the state to change. At time t the ket corresponding to the
state in Schrödinger's picture will be T|α'⟩, according to equation (1)
of § 27. The probability of the α's then having the values α'' is

$$P(\alpha'\alpha'') = |\langle\alpha''|T|\alpha'\rangle|^2. \tag{11}$$

For α'' ≠ α', P(α'α'') is the probability of a transition taking place
from state α' to state α'' during the time interval t_0 → t, while P(α'α')
is the probability of no transition taking place at all. The sum of
P(α'α'') for all α'' is, of course, unity.

† Z. f. Physik, 35 (1925), 565.

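Let us now suppose that initially the system, instead of being
certainly in the state α', is in one or other of various states α' with
the probability P_α' for each. The Gibbs density corresponding to this
distribution is, according to (68) of § 33,

$$\rho = \sum_{\alpha'} |\alpha'\rangle P_{\alpha'} \langle\alpha'|. \tag{12}$$

At time t, each ket |α'⟩ will have changed to T|α'⟩ and each bra ⟨α'|
to ⟨α'|T̄, so ρ will have changed to

$$\rho_t = \sum_{\alpha'} T|\alpha'\rangle P_{\alpha'} \langle\alpha'|\bar{T}. \tag{13}$$

The probability of the α's then having the values α'' will be, from
(73) of § 33,

$$\langle\alpha''|\rho_t|\alpha''\rangle = \sum_{\alpha'} \langle\alpha''|T|\alpha'\rangle P_{\alpha'} \langle\alpha'|\bar{T}|\alpha''\rangle = \sum_{\alpha'} P_{\alpha'} P(\alpha'\alpha'') \tag{14}$$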

with the help of (11). This result expresses that the probability of
the system being in the state α'' at time t is the sum of the probabilities
of the system being initially in any state α' ≠ α'' and making a
transition from state α' to state α'', and the probability of its being
initially in the state α'' and making no transition. Thus the various
transition probabilities act independently of one another, according
to the ordinary laws of probability.
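The statement (14) can be illustrated numerically. The sketch below is an assumption-laden example rather than anything from the original text: it takes a hypothetical four-state system, draws an arbitrary unitary T (here from a QR decomposition of a random matrix), and compares the diagonal elements of the evolved Gibbs density (13) with the ordinary composition of probabilities.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 4-state system: an arbitrary unitary T from a QR decomposition.
A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
T, _ = np.linalg.qr(A)

# Transition probabilities (11): P[f, i] = |<f|T|i>|^2.
P = np.abs(T) ** 2

# Initial probabilities P_alpha' and the corresponding Gibbs density (12).
p_init = np.array([0.1, 0.2, 0.3, 0.4])
rho = np.diag(p_init).astype(complex)

# Evolved density (13); its diagonal elements are the left side of (14).
rho_t = T @ rho @ T.conj().T
p_quantum = np.real(np.diag(rho_t))

# Right side of (14): ordinary composition of probabilities.
p_classical = P @ p_init
```

Each column of `P` sums to unity, as remarked after (11), and `p_quantum` agrees with `p_classical`. The agreement relies on the initial density being diagonal in the α representation, which is the situation the text considers.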

The whole problem of calculating transitions thus reduces to the
determination of the probability amplitudes ⟨α''|T|α'⟩. These can be
worked out from the differential equation for T, equation (6) of § 27, or

$$i\hbar\, dT/dt = HT = (E+V)T. \tag{15}$$

The calculation can be simplified by working with

$$T^* = e^{iE(t-t_0)/\hbar}\, T. \tag{16}$$

We have

$$i\hbar\, dT^*/dt = e^{iE(t-t_0)/\hbar}(-ET + i\hbar\, dT/dt) = e^{iE(t-t_0)/\hbar}\, VT = V^* T^*, \tag{17}$$

where

$$V^* = e^{iE(t-t_0)/\hbar}\, V\, e^{-iE(t-t_0)/\hbar}, \tag{18}$$


i.e. V* is the result of applying a certain unitary transformation to V.
Equation (17) is of a more convenient form than (15), because (17)
makes the change in T* depend entirely on the perturbation V, and
for V = 0 it would make T* equal its initial value, namely unity.
We have from (16)

$$\langle\alpha''|T^*|\alpha'\rangle = e^{iE''(t-t_0)/\hbar}\, \langle\alpha''|T|\alpha'\rangle,$$

so that

$$P(\alpha'\alpha'') = |\langle\alpha''|T^*|\alpha'\rangle|^2, \tag{19}$$

showing that T* and T are equally good for determining transition
probabilities.

Our work up to the present has been exact. We now assume V is
a small quantity of the first order and express T* in the form

$$T^* = 1 + T_1^* + T_2^* + \ldots, \tag{20}$$

where T_1* is of the first order, T_2* is of the second, and so on.
Substituting (20) into (17) and equating terms of equal order, we get

$$i\hbar\, dT_1^*/dt = V^*, \qquad i\hbar\, dT_2^*/dt = V^* T_1^*, \qquad \ldots \tag{21}$$

From the first of these equations we obtain

$$T_1^* = -i\hbar^{-1} \int_{t_0}^{t} V^*(t')\, dt', \tag{22}$$

from the second we obtain

$$T_2^* = -\hbar^{-2} \int_{t_0}^{t} V^*(t')\, dt' \int_{t_0}^{t'} V^*(t'')\, dt'', \tag{23}$$

and so on. For many practical problems it is sufficiently accurate to
retain only the term T_1*, which gives for the transition probability
P(α'α'') with α'' ≠ α'

$$P(\alpha'\alpha'') = \hbar^{-2} \left| \langle\alpha''| \int_{t_0}^{t} V^*(t')\, dt' |\alpha'\rangle \right|^2 = \hbar^{-2} \left| \int_{t_0}^{t} \langle\alpha''|V^*(t')|\alpha'\rangle\, dt' \right|^2. \tag{24}$$

We obtain in this way the transition probability to the second order
of accuracy. The result depends only on the matrix element
⟨α''|V*(t')|α'⟩ of V*(t') referring to the two states concerned, with t'
going from t_0 to t. Since V* is real, like V,

$$\langle\alpha''|V^*(t')|\alpha'\rangle = \overline{\langle\alpha'|V^*(t')|\alpha''\rangle}$$

and hence

$$P(\alpha'\alpha'') = P(\alpha''\alpha') \tag{25}$$

to the second order of accuracy.
Sometimes one is interested in a transition α' → α'' such that the
matrix element ⟨α''|V*|α'⟩ vanishes, or is small compared with other
matrix elements of V*. It is then necessary to work to a higher
accuracy. If we retain only the terms T_1* and T_2*, we get, for α'' ≠ α',

$$P(\alpha'\alpha'') = \hbar^{-2} \left| \int_{t_0}^{t} \langle\alpha''|V^*(t')|\alpha'\rangle\, dt' - i\hbar^{-1} \sum_{\alpha''' \neq \alpha', \alpha''} \int_{t_0}^{t} \langle\alpha''|V^*(t')|\alpha'''\rangle\, dt' \int_{t_0}^{t'} \langle\alpha'''|V^*(t'')|\alpha'\rangle\, dt'' \right|^2. \tag{26}$$

The terms α''' = α' and α''' = α'' are omitted from the sum since they
are small compared with other terms of the sum, on account of the
smallness of ⟨α''|V*|α'⟩. To interpret the result (26), we may suppose
that the term

$$\int_{t_0}^{t} \langle\alpha''|V^*(t')|\alpha'\rangle\, dt' \tag{27}$$

gives rise to a transition directly from state α' to state α'', while the
term

$$-i\hbar^{-1} \int_{t_0}^{t} \langle\alpha''|V^*(t')|\alpha'''\rangle\, dt' \int_{t_0}^{t'} \langle\alpha'''|V^*(t'')|\alpha'\rangle\, dt'' \tag{28}$$

gives rise to a transition from state α' to state α''', followed by a
transition from state α''' to state α''. The state α''' is called an
intermediate state in this interpretation. We must add the term (27) to
the various terms (28) corresponding to different intermediate states
and then take the square of the modulus of the sum, which means
that there is interference between the different transition processes
(the direct one and those involving intermediate states), and one
cannot give a meaning to the probability for one of these processes by
itself. For each of these processes, however, there is a probability
amplitude. If one carries out the perturbation method to a higher
degree of accuracy, one obtains a result which can be interpreted
similarly, with the help of more complicated transition processes
involving a succession of intermediate states.


45.  Application  to  radiation 

In  the  preceding  section  a general  theory  of  the  perturbation  of  an 
atomic  system  was  developed,  in  which  the  perturbing  energy  could 
vary  with  the  time  in  an  arbitrary  way.  A perturbation  of  this 
kind  can  be  realized  in  practice  by  allowing  incident  electromagnetic 
radiation to fall on the system. Let us see what our result (24) reduces
to  in  this  case. 

If we neglect the effects of the magnetic field of the incident
radiation, and if we further assume that the wave-lengths of the harmonic
components of this radiation are all large compared with the
dimensions of the atomic system, then the perturbing energy is simply the
scalar product

$$V = (\mathbf{D}, \boldsymbol{\mathcal{E}}), \tag{29}$$

where the vector D is the total electric displacement of the system and
the vector ℰ is the electric force of the incident radiation. We suppose
ℰ to be a given function of the time. If we take for simplicity the case
when the incident radiation is plane polarized with its electric vector in
a certain direction and let D denote the Cartesian component of the
vector D in this direction, the expression (29) for V reduces to the
ordinary product

$$V = D\mathcal{E},$$

where ℰ is the magnitude of the vector ℰ. The matrix elements of
V are

$$\langle\alpha''|V|\alpha'\rangle = \langle\alpha''|D|\alpha'\rangle\, \mathcal{E},$$

since ℰ is a number. The matrix element ⟨α''|D|α'⟩ is independent
of t. From (18)

$$\langle\alpha''|V^*(t)|\alpha'\rangle = \langle\alpha''|D|\alpha'\rangle\, e^{i(E''-E')(t-t_0)/\hbar}\, \mathcal{E}(t),$$

and hence the expression (24) for the transition probability becomes

$$P(\alpha'\alpha'') = \hbar^{-2}\, |\langle\alpha''|D|\alpha'\rangle|^2 \left| \int_{t_0}^{t} e^{i(E''-E')(t'-t_0)/\hbar}\, \mathcal{E}(t')\, dt' \right|^2. \tag{30}$$

If the incident radiation during the time interval t_0 to t is resolved
into its Fourier components, the energy crossing unit area per unit
frequency range about the frequency ν will be, according to classical
electrodynamics,

$$E_\nu = \frac{c}{2\pi} \left| \int_{t_0}^{t} e^{2\pi i \nu (t'-t_0)}\, \mathcal{E}(t')\, dt' \right|^2. \tag{31}$$

Comparing this with (30), we obtain

$$P(\alpha'\alpha'') = 2\pi c^{-1} \hbar^{-2}\, |\langle\alpha''|D|\alpha'\rangle|^2\, E_\nu, \tag{32}$$

where

$$\nu = |E'' - E'|/h. \tag{33}$$

From this result we see in the first place that the transition
probability depends only on that Fourier component of the incident
radiation whose frequency ν is connected with the change of energy by (33).
This gives us Bohr's Frequency Condition and shows how the ideas
of Bohr's atomic theory, which was the forerunner of quantum
mechanics, can be fitted in with quantum mechanics.

The present elementary theory does not tell us anything about the
energy of the field of radiation. It would be reasonable to assume,
though, that the energy absorbed or liberated by the atomic system
in the transition process comes from or goes into the component of
the radiation with frequency ν given by (33). This assumption will
be justified by the more complete theory of radiation given in
Chapter X. The result (32) is then to be interpreted as the
probability of the system, if initially in the state of lower energy,
absorbing radiation and being carried to the upper state, and if initially
in the upper state, being stimulated by the incident radiation to emit
and fall to the lower state. The present theory does not account for
the experimental fact that the system, if in the upper state with no
incident radiation, can emit spontaneously and fall to the lower state,
but this also will be accounted for by the more complete theory of
Chapter X.

The existence of the phenomenon of stimulated emission was
inferred by Einstein,† long before the discovery of quantum mechanics,
from a consideration of statistical equilibrium between atoms and a
field of black-body radiation satisfying Planck's law. Einstein showed
that the transition probability for stimulated emission must equal
that for absorption between the same pair of states, in agreement
with the present quantum theory, and deduced also a relation
connecting this transition probability with that for spontaneous emission,
which relation is in agreement with the theory of Chapter X.

The matrix element ⟨α''|D|α'⟩ in (32) plays the part of the
amplitude of one of the Fourier components of D in the classical theory of
a multiply-periodic system interacting with radiation. In fact it was
the idea of replacing classical Fourier components by matrix elements
which led Heisenberg to the discovery of quantum mechanics in 1925.
Heisenberg assumed that the formulas describing the interaction with
radiation of a system in the quantum theory can be obtained from
the classical formulas by substituting for the Fourier components of
the total electric displacement of the system the corresponding matrix
elements. According to this assumption applied to spontaneous
emission, a system having an electric moment D will, when in the state
α', spontaneously emit radiation of frequency ν = (E' − E'')/h, where
E'' is an energy-level, less than E', of some state α'', at the rate

$$\frac{4}{3}\, \frac{(2\pi\nu)^4}{c^3}\, |\langle\alpha''|\mathbf{D}|\alpha'\rangle|^2. \tag{34}$$

The distribution of this radiation over the different directions of
emission and its state of polarization for each direction will be the
same as that for a classical electric dipole of moment equal to the
real part of ⟨α''|D|α'⟩. To interpret this rate of emission of radiant
energy as a transition probability, we must divide it by the quantum
of energy of this frequency, namely hν, and call it the probability per
unit time of this quantum being spontaneously emitted, with the
atomic system simultaneously dropping to the state α'' of lower
energy. These assumptions of Heisenberg are justified by the present
radiation theory, supplemented by the spontaneous transition theory
of Chapter X.

† Einstein, Phys. Zeits. 18 (1917), 121.

46.  Transitions  caused  by  a perturbation  independent  of  the 
time 

The perturbation method of § 44 is still valid when the perturbing
energy V does not involve the time t explicitly. Since the total
Hamiltonian  H in  this  case  does  not  involve  t explicitly,  we  could 
now,  if  desired,  deal  with  the  system  by  the  perturbation  method  of 
§ 43  and  find  its  stationary  states.  Whether  this  method  would  be 
convenient  or  not  would  depend  on  what  we  want  to  find  out  about 
the  system.  If  what  we  have  to  calculate  makes  an  explicit  reference 
to  the  time,  e.g.  if  we  have  to  calculate  the  probability  of  the  system 
being  in  a certain  state  at  one  time  when  we  are  given  that  it  is  in  a 
certain  state  at  another  time,  the  method  of  § 44  would  be  the  more 
convenient  one. 

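Let us see what the result (24) for the transition probability becomes
when V does not involve t explicitly and let us take t_0 = 0 to simplify
the writing. The matrix element ⟨α''|V|α'⟩ is now independent of t,
and from (18)

$$\langle\alpha''|V^*(t)|\alpha'\rangle = \langle\alpha''|V|\alpha'\rangle\, e^{i(E''-E')t/\hbar}, \tag{35}$$

so

$$\int_0^t \langle\alpha''|V^*(t')|\alpha'\rangle\, dt' = \langle\alpha''|V|\alpha'\rangle\, \frac{e^{i(E''-E')t/\hbar} - 1}{i(E''-E')/\hbar},$$

provided E'' ≠ E'. Thus the transition probability (24) becomes

$$P(\alpha'\alpha'') = |\langle\alpha''|V|\alpha'\rangle|^2\, [e^{i(E''-E')t/\hbar} - 1][e^{-i(E''-E')t/\hbar} - 1]/(E''-E')^2$$
$$= 2\, |\langle\alpha''|V|\alpha'\rangle|^2\, [1 - \cos\{(E''-E')t/\hbar\}]/(E''-E')^2. \tag{36}$$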
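As a rough numerical illustration of (36), one may compare it with the exact transition probability for a two-state system, where the exact answer is easily obtained by diagonalizing H. This is only a sketch under stated assumptions: the two-state Hamiltonian, the values of `v` and `dE`, the choice of units with ħ = 1, and the function names are all hypothetical and not taken from the text.

```python
import numpy as np

HBAR = 1.0  # units with hbar = 1 (an assumption of this sketch)

def p_first_order(v, dE, t):
    """Formula (36): 2|v|^2 [1 - cos(dE t / hbar)] / dE^2."""
    return 2 * abs(v) ** 2 * (1 - np.cos(dE * t / HBAR)) / dE ** 2

def p_exact(v, dE, t):
    """Exact |<2|T|1>|^2 for H = [[0, v], [conj(v), dE]] by diagonalization."""
    H = np.array([[0.0, v], [np.conj(v), dE]], dtype=complex)
    w, U = np.linalg.eigh(H)                       # H = U diag(w) U^dagger
    T = U @ np.diag(np.exp(-1j * w * t / HBAR)) @ U.conj().T
    return abs(T[1, 0]) ** 2

# Hypothetical values: a perturbation small compared with the level spacing.
v, dE = 0.01, 1.0
times = np.linspace(0.0, 20.0, 201)
approx = np.array([p_first_order(v, dE, t) for t in times])
exact = np.array([p_exact(v, dE, t) for t in times])
max_diff = np.max(np.abs(approx - exact))
```

With these values the probability never exceeds 4|v|²/dE², in line with the remark that the transition probability stays small when E'' differs appreciably from E', and the two curves differ only by terms of higher order in v.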
If E'' differs appreciably from E' this transition probability is small
and remains so for all values of t. This result is required by the law
of the conservation of energy. The total energy H is constant and
hence the proper-energy E (i.e. the energy with neglect of the part
V due to the perturbation), being approximately equal to H, must
be approximately constant. This means that if E initially has the
numerical value E', at any later time there must be only a small
probability of its having a numerical value differing considerably
from E'.

On the other hand, when the initial state α' is such that there exists
another state α'' having the same or very nearly the same
proper-energy E, the probability of a transition to the final state α'' may be
quite large. The case of physical interest now is that in which there
is a continuous range of final states α'' having a continuous range of
proper-energy levels E'' passing through the value E' of the
proper-energy of the initial state. The initial state must not be one of the
continuous range of final states, but may be either a separate discrete
state or one of another continuous range of states. We shall now have,
remembering the rules of § 18 for the interpretation of probability
amplitudes with continuous ranges of states, that, with P(α'α'')
having the value (36), the probability of a transition to a final state
within the small range α'' to α'' + dα'' will be P(α'α'') dα'' if the initial
state α' is discrete and will be proportional to this quantity if α' is
one of a continuous range.

We may suppose that the α's describing the final state consist of
E together with a number of other dynamical variables β, so that we
have a representation like that of § 43 for the degenerate case. (The
β's, however, need have no meaning for the initial state α'.) We shall
suppose for definiteness that the β's have only discrete eigenvalues.
The total probability of a transition to a final state α'' for which the
β's have the values β'' and E has any value (there will be a strong
probability of its having a value near the initial value E') will now
be (or be proportional to)

$$\int P(\alpha'\alpha'')\, dE'' = 2 \int_{-\infty}^{\infty} |\langle E''\beta''|V|\alpha'\rangle|^2\, \frac{1 - \cos\{(E''-E')t/\hbar\}}{(E''-E')^2}\, dE''$$
$$= 2t\hbar^{-1} \int_{-\infty}^{\infty} |\langle E'+\hbar x/t,\, \beta''|V|\alpha'\rangle|^2\, \frac{1 - \cos x}{x^2}\, dx \tag{37}$$

if one makes the substitution (E'' − E')t/ħ = x. For large values of t
this reduces to

$$2t\hbar^{-1}\, |\langle E'\beta''|V|\alpha'\rangle|^2 \int_{-\infty}^{\infty} \frac{1 - \cos x}{x^2}\, dx = 2\pi t \hbar^{-1}\, |\langle E'\beta''|V|\alpha'\rangle|^2. \tag{38}$$

Thus the total probability up to time t of a transition to a final state
for which the β's have the values β'' is proportional to t. There is
therefore a definite probability coefficient, or probability per unit time,
for the transition process under consideration, having the value

$$2\pi \hbar^{-1}\, |\langle E'\beta''|V|\alpha'\rangle|^2. \tag{39}$$

It is proportional to the square of the modulus of the matrix element,
associated with this transition, of the perturbing energy.
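The step from (37) to (38) rests on the value π of the integral of (1 − cos x)/x² over all x. A quick numerical check is sketched below; the midpoint rule, the cut-off L and the function name are choices made for this illustration only.

```python
import numpy as np

def truncated_integral(L, n=2_000_000):
    """Midpoint-rule estimate of the integral of (1 - cos x)/x^2 over [-L, L]."""
    dx = L / n
    x = (np.arange(n) + 0.5) * dx          # midpoints on (0, L); avoids x = 0
    half = np.sum((1 - np.cos(x)) / x ** 2) * dx
    return 2 * half                        # the integrand is even in x

estimate = truncated_integral(2000.0)
```

The truncated integral falls short of π by roughly 2/L, since the neglected tails behave like the integral of 1/x², so `estimate` should approach π as the cut-off is raised.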

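If the matrix element ⟨E'β''|V|α'⟩ is small compared with other
matrix elements of V, we must work with the more accurate formula
(26). We have from (35)

$$\int_0^t \langle\alpha''|V^*(t')|\alpha'''\rangle\, dt' \int_0^{t'} \langle\alpha'''|V^*(t'')|\alpha'\rangle\, dt''$$
$$= \langle\alpha''|V|\alpha'''\rangle \langle\alpha'''|V|\alpha'\rangle \int_0^t e^{i(E''-E''')t'/\hbar}\, dt' \int_0^{t'} e^{i(E'''-E')t''/\hbar}\, dt''$$
$$= \frac{\langle\alpha''|V|\alpha'''\rangle \langle\alpha'''|V|\alpha'\rangle}{i(E'''-E')/\hbar} \int_0^t \{ e^{i(E''-E')t'/\hbar} - e^{i(E''-E''')t'/\hbar} \}\, dt'.$$

For E'' close to E', only the first term in the integrand here gives rise
to a transition probability of physical importance and the second
term may be discarded. Using this result in (26) we get

$$P(\alpha'\alpha'') = 2 \left| \langle\alpha''|V|\alpha'\rangle - \sum_{\alpha''' \neq \alpha', \alpha''} \frac{\langle\alpha''|V|\alpha'''\rangle \langle\alpha'''|V|\alpha'\rangle}{E''' - E'} \right|^2 \frac{1 - \cos\{(E''-E')t/\hbar\}}{(E''-E')^2},$$

which replaces (36). Proceeding as before, we obtain for the
transition probability per unit time to a final state for which the β's have
the values β'' and E has a value close to its initial value E'

$$\frac{2\pi}{\hbar} \left| \langle E'\beta''|V|\alpha'\rangle - \sum_{\alpha''' \neq \alpha', \alpha''} \frac{\langle E'\beta''|V|\alpha'''\rangle \langle\alpha'''|V|\alpha'\rangle}{E''' - E'} \right|^2. \tag{40}$$

This formula shows how intermediate states, differing from the initial
state and final state, play a role in the determination of a probability
coefficient.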
In order that the approximations used in deriving (38) may be valid,
the time t must be neither too small nor too large. It must be large
enough for the limiting form (38) of the integral (37) to hold, while it
must not be excessively large, or the perturbation approximation will
break down. In fact one could make the probability (38) greater than
unity by taking t large enough. There is no difficulty in t satisfying
both these conditions simultaneously provided the perturbing energy V
is sufficiently small.

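47. The anomalous Zeeman effect

One of the simplest examples of the perturbation method of § 43
is the calculation of the first-order change in the energy-levels of an
atom caused by a uniform magnetic field. The problem of a hydrogen
atom in a uniform magnetic field was dealt with in § 41. [...]

We first of all consider the atom in the absence of the magnetic
field and look for constants of the motion or quantities that are
approximately constants of the motion. The total angular momentum of
the atom, the vector j say, is certainly a constant of the motion. This
angular momentum may be regarded as the sum of two parts, the total
orbital angular momentum of all the electrons, l say, and the total spin
angular momentum, s say. Now the effect of the spin magnetic moments
on the motion of the electrons is small compared with the effect of the
Coulomb forces and may be neglected as a first approximation. [...]
The magnitudes, l, s, and j say, of the vectors l, s, and j will be given by

$$l + \tfrac{1}{2}\hbar = (l_x^2 + l_y^2 + l_z^2 + \tfrac{1}{4}\hbar^2)^{1/2},$$
$$s + \tfrac{1}{2}\hbar = (s_x^2 + s_y^2 + s_z^2 + \tfrac{1}{4}\hbar^2)^{1/2},$$
$$j + \tfrac{1}{2}\hbar = (j_x^2 + j_y^2 + j_z^2 + \tfrac{1}{4}\hbar^2)^{1/2},$$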
corresponding to equation (39) of § 36. They commute with each
other, and from (47) of § 36 we see that with given numerical values
for l and s the possible numerical values for j are

l + s, l + s − ħ, ..., |l − s|.

Let  us  consider  a stationary  state  for  which  l,  s,  and  j have  definite 
numerical  values  in  agreement  with  the  above  scheme.  The  energy 
of  this  state  will  depend  on  l,  but  one  might  think  that  with  neglect 
of  the  spin  magnetic  moments  it  would  be  independent  of  s,  and 
also  of  the  direction  of  the  vector  s relative  to  1,  and  thus  of  j.  It  will 
be  found  in  Chapter  IX,  however,  that  the  energy  depends  very  much 
on  the  magnitude  s of  the  vector  s,  although  independent  of  its 
direction  when  one  neglects  the  spin  magnetic  moments,  on  account 
of  certain  phenomena  arising  from  the  fact  that  the  electrons  are 
indistinguishable  one  from  another.  There  are  thus  different  energy- 
levels  of  the  system  for  each  different  value  of  l and  s.  This  means 
that  l and  s are  functions  of  the  energy,  according  to  the  general 
definition of a function given in § 11, since the l and s of a stationary
state  are  fixed  when  the  energy  of  that  state  is  fixed. 

We can now take into account the effect of the spin magnetic
moments, treating it as a small perturbation according to the method
of § 43. The energy of the unperturbed system will still be
approximately a constant of the motion and hence l and s, being functions
of this energy, will still be approximately constants of the motion.
The directions of the vectors l and s, however, not being functions of
the unperturbed energy, need not now be approximately constants
of the motion and may undergo large secular variations. Since the
vector j is constant, the only possible variation of l and s is a
precession about the vector j. We thus have an approximate model of
the atom consisting of the two vectors l and s of constant lengths
precessing about their sum j, which is a fixed vector. The energy is
determined mainly by the magnitudes of l and s and depends only
slightly on their relative directions, specified by j. Thus states with
the same l and s and different j will have only slightly different
energy-levels, forming what is called a multiplet term.

Let us now take this atomic model as our unperturbed system and
suppose it to be subjected to a uniform magnetic field of magnitude ℋ
in the direction of the z-axis. The extra energy due to this magnetic
field will consist of a term

$$\frac{e\mathcal{H}}{2mc}\,(m_z + \hbar\sigma_z), \tag{41}$$

like the last term in equation (89) of § 41, contributed by each
electron, and will thus be altogether

$$\frac{e\mathcal{H}}{2mc} \sum (m_z + \hbar\sigma_z) = \frac{e\mathcal{H}}{2mc}\,(l_z + 2s_z) = \frac{e\mathcal{H}}{2mc}\,(j_z + s_z). \tag{42}$$

This is our perturbing energy V. We shall now use the method of
§ 43 to determine the changes in the energy-levels caused by this V.
The method will be legitimate only provided the field is so weak that
V is small compared with the energy differences within a multiplet.

Our unperturbed system is degenerate, on account of the direction
of the vector j being undetermined. We must therefore take, from
the representative of V in a Heisenberg representation for the
unperturbed system, those matrix elements that refer to one particular
energy-level for their row and column, and obtain the eigenvalues of
the matrix thus formed. We can do this best by first splitting up V
into two parts, one of which is a constant of the unperturbed motion,
so that its representative contains only matrix elements referring to
the same unperturbed energy-level for their row and column, while
the representative of the other contains only matrix elements
referring to two different unperturbed energy-levels for their row and
column, so that this second part does not affect the first-order
perturbation. The term involving j_z in (42) is a constant of the
unperturbed motion and thus belongs entirely to the first part. For the
term involving s_z we have

$$s_z(j_x^2 + j_y^2 + j_z^2) = j_z(s_x j_x + s_y j_y + s_z j_z) + (s_z j_x - j_z s_x) j_x + (s_z j_y - j_z s_y) j_y$$

or

$$s_z = j_z\, \frac{s_x j_x + s_y j_y + s_z j_z}{j(j+\hbar)} + (\gamma_x j_y - \gamma_y j_x)\, \frac{1}{j(j+\hbar)}, \tag{43}$$

where

$$\gamma_x = s_z j_y - j_z s_y = s_z l_y - l_z s_y = l_y s_z - l_z s_y,$$
$$\gamma_y = j_z s_x - s_z j_x = l_z s_x - s_z l_x = l_z s_x - l_x s_z. \tag{44}$$

The first term in this expression for s_z is a constant of the unperturbed
motion and thus belongs entirely to the first part, while the second
term, as we shall now see, belongs entirely to the second part.
Corresponding to (44) we can introduce

$$\gamma_z = l_x s_y - l_y s_x.$$

It can now easily be verified that

$$j_x \gamma_x + j_y \gamma_y + j_z \gamma_z = 0$$

and from (30) of § 35

$$[j_z, \gamma_x] = \gamma_y, \qquad [j_z, \gamma_y] = -\gamma_x, \qquad [j_z, \gamma_z] = 0.$$
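The relations just stated can be verified with explicit matrices. The sketch below is illustrative only: it assumes the particular case l = 1, s = ½ in units with ħ = 1, builds the angular momentum matrices from the standard ladder-operator elements, and uses the correspondence that the bracket [u, v] of the text equals (uv − vu)/iħ. The helper names `angular_momentum` and `comm` are inventions of this example.

```python
import numpy as np

def angular_momentum(q):
    """Matrices (Jx, Jy, Jz) for quantum number q, in units of hbar."""
    dim = int(round(2 * q)) + 1
    m = q - np.arange(dim)                     # m = q, q-1, ..., -q
    jz = np.diag(m).astype(complex)
    jp = np.zeros((dim, dim), dtype=complex)   # raising operator J+
    for k in range(1, dim):
        jp[k - 1, k] = np.sqrt(q * (q + 1) - m[k] * (m[k] + 1))
    jx = (jp + jp.conj().T) / 2
    jy = (jp - jp.conj().T) / (2j)
    return jx, jy, jz

# Hypothetical example: l = 1 and s = 1/2 acting on the product space.
Lx, Ly, Lz = angular_momentum(1.0)
Sx, Sy, Sz = angular_momentum(0.5)
l = [np.kron(A, np.eye(2)) for A in (Lx, Ly, Lz)]
s = [np.kron(np.eye(3), A) for A in (Sx, Sy, Sz)]
j = [a + b for a, b in zip(l, s)]

# The components gamma_x, gamma_y of (44) and gamma_z.
gamma = [l[1] @ s[2] - l[2] @ s[1],
         l[2] @ s[0] - l[0] @ s[2],
         l[0] @ s[1] - l[1] @ s[0]]

def comm(a, b):
    return a @ b - b @ a

orthogonality = sum(j[k] @ gamma[k] for k in range(3))
rel_x = comm(j[2], gamma[0]) - 1j * gamma[1]   # [j_z, gamma_x] = gamma_y
rel_y = comm(j[2], gamma[1]) + 1j * gamma[0]   # [j_z, gamma_y] = -gamma_x
rel_z = comm(j[2], gamma[2])                   # [j_z, gamma_z] = 0

# Identity behind (43): s_z j^2 = j_z (s.j) + gamma_x j_y - gamma_y j_x.
j_sq = sum(a @ a for a in j)
s_dot_j = sum(a @ b for a, b in zip(s, j))
identity = s[2] @ j_sq - (j[2] @ s_dot_j + gamma[0] @ j[1] - gamma[1] @ j[0])

residuals = [np.abs(M).max()
             for M in (orthogonality, rel_x, rel_y, rel_z, identity)]
```

All entries of `residuals` should vanish to rounding error. A check for one pair of values of l and s is of course no proof, but it guards against sign slips in (43) and (44).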
These relations connecting j_x, j_y, j_z and γ_x, γ_y, γ_z are of the same
form as the relations connecting m_x, m_y, m_z and x, y, z in the
calculation in § 40 of the selection rule for the matrix elements of z in a
representation with k diagonal. From the result there obtained that all
matrix elements of z vanish except those referring to two k values
differing by ±ħ, we can infer that all matrix elements of γ_z, and
similarly of γ_x and γ_y, in a representation with j diagonal, vanish
except those referring to two j values differing by ±ħ. The
coefficients of γ_x and γ_y in the second term on the right-hand side of (43)
commute with j, so the representative of the whole of this term will
contain only matrix elements referring to two j values differing by
±ħ, and thus referring to two different energy-levels of the
unperturbed system.

Hence the perturbing energy V becomes, when we neglect that
part of it whose representative consists of matrix elements referring
to two different unperturbed energy-levels,

$$\frac{e\mathcal{H}}{2mc}\, j_z \left\{ 1 + \frac{s_x j_x + s_y j_y + s_z j_z}{j(j+\hbar)} \right\} = \frac{e\mathcal{H}}{2mc}\, j_z \left\{ 1 + \frac{j(j+\hbar) - l(l+\hbar) + s(s+\hbar)}{2j(j+\hbar)} \right\}. \tag{45}$$

The eigenvalues of this give the first-order changes in the
energy-levels. We can make the representative of this expression diagonal
by choosing our representation such that j_z is diagonal, and it then
gives us directly the first-order changes in the energy-levels caused by
the magnetic field. This expression is known as Landé's formula.
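To make Landé's formula concrete, the bracket in (45) can be evaluated for a few familiar multiplet terms. The following sketch measures j, l and s in units of ħ and the energy shifts in units of eℋħ/2mc; the function names and the example terms are choices made for this illustration and are not part of the original text.

```python
from fractions import Fraction

def lande_factor(j, l, s):
    """Bracket of (45) with j, l, s in units of hbar (j must be nonzero)."""
    j, l, s = Fraction(j), Fraction(l), Fraction(s)
    jj = j * (j + 1)
    return 1 + (jj - l * (l + 1) + s * (s + 1)) / (2 * jj)

def first_order_shifts(j, l, s):
    """First-order shifts, in units of e*H*hbar/(2mc), for j_z = -j, ..., +j."""
    j = Fraction(j)
    g = lande_factor(j, l, s)
    return [g * (-j + k) for k in range(int(2 * j) + 1)]

# Example terms: a doublet P term (l = 1, s = 1/2) and a doublet S term.
g_p32 = lande_factor(Fraction(3, 2), 1, Fraction(1, 2))   # j = 3/2
g_p12 = lande_factor(Fraction(1, 2), 1, Fraction(1, 2))   # j = 1/2
g_s12 = lande_factor(Fraction(1, 2), 0, Fraction(1, 2))   # pure spin
```

Exact fractions are used so that the characteristic values 4/3, 2/3 and 2 appear without rounding. The factor 2 for l = 0 reflects the coefficient 2 of s_z in (42).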

The result (45) holds only provided the perturbing energy V is small
compared with the energy differences within a multiplet. For larger
values of V a more complicated theory is required. For very strong
fields, however, for which V is large compared with the energy
differences within a multiplet, the theory is again very simple. We may
now neglect altogether the energy of the spin magnetic moments for
the atom with no external field, so that for our unperturbed system
the vectors l and s themselves are constants of the motion, and not
merely their magnitudes l and s. Our perturbing energy V, which is
still eℋ/2mc · (j_z + s_z), is now a constant of the motion for the
unperturbed system, so that its eigenvalues give directly the changes in the
energy-levels. These eigenvalues are integral or half-odd integral
multiples of eℋħ/2mc according to whether the number of electrons
in the atom is even or odd.
48. General remarks

In this chapter we shall investigate problems connected with a
particle which, coming from infinity, encounters or 'collides with' some
atomic system and, after being scattered through a certain angle, goes
off to infinity again. The atomic system which does the scattering
we shall call, for brevity, the scatterer. We thus have a dynamical
system composed of an incident particle and a scatterer interacting
with each other, which we must deal with according to the laws of
quantum mechanics, and for which we must, in particular, calculate
the probability of scattering through any given angle. The scatterer
is usually assumed to be of infinite mass and to be at rest throughout
the scattering process. The problem was first solved by Born by a
method substantially equivalent to that of the next section. We must
take into account the possibility that the scatterer, considered as a
system by itself, may have a number of different stationary states
and that if it is initially in one of these states when the particle arrives
from infinity, it may be left in a different one when the particle goes
off to infinity again. The colliding particle may thus induce
transitions in the scatterer.

The Hamiltonian for the whole system of scatterer plus particle
will not involve the time explicitly, so that this whole system will
have stationary states represented by periodic solutions of
Schrödinger's wave equation. The meaning of these stationary states
requires a little care to be properly understood. It is evident that
for any state of motion of the system the particle will spend nearly all
its time at infinity, so that the time average of the probability of the
particle being in any finite volume will be zero. Now for a stationary
state the probability of the particle being in a given finite volume,
like any other result of observation, must be independent of the time,
and hence this probability will equal its time average, which we have
seen is zero. Thus only the relative probabilities of the particle being
in different finite volumes will be physically significant, their absolute
values being all zero. The total energy of the system has a continuous
range of eigenvalues, since the initial energy of the particle can be
anything. Thus a ket, |s⟩ say, corresponding to a stationary state,
being an eigenket of the total energy, must be of infinite length. We
can see a physical reason for this, since if |s⟩ were normalized and if
Q denotes that observable, a certain function of the position of
the particle, that is equal to unity if the particle is in a given finite
volume and zero otherwise, then ⟨s|Q|s⟩ would be zero, meaning that
the average value of Q, i.e. the probability of the particle being in the
given volume, is zero. Such a ket |s⟩ would not be a convenient one
to work with. However, with |s⟩ of infinite length, ⟨s|Q|s⟩ can be
finite and would then give the relative probability of the particle
being in the given volume.

In picturing a state of a system corresponding to a ket |x⟩ which
is not normalized, but for which ⟨x|x⟩ = n say, it may be convenient
to suppose that we have n similar systems all occupying the same
space but with no interaction between them, so that each one follows
out its own motion independently of the others, as we had in the
theory of the Gibbs ensemble in § 33. We can then interpret ⟨x|α|x⟩,
where α is any observable, directly as the total α for all the n systems.
In applying these ideas to the above-mentioned |s⟩ of infinite length,
corresponding to a stationary state of the system of scatterer plus
colliding particle, we should picture an infinite number of such
systems with the scatterers all located at the same point and the particles
distributed continuously throughout space. The number of particles
in a given finite volume would be pictured as ⟨s|Q|s⟩, Q being the
observable defined above, which has the value unity when the particle
is in the given volume and zero otherwise. If the ket is represented
by a Schrödinger wave function involving the Cartesian coordinates
of the particle, then the square of the modulus of the wave function
could be interpreted directly as the density of particles in the picture.
One must remember, however, that each of these particles has its own
individual scatterer. Different particles may belong to scatterers in
different states. There will thus be one particle density for each state
of the scatterer, namely the density of those particles belonging to
scatterers in that state. This is taken account of by the wave function
involving variables describing the state of the scatterer in addition
to those describing the position of the particle.

For  determining  scattering  coefficients  we  have  to  investigate 
stationary  states  of  the  whole  system  of  scatterer  plus  particle.  For 
instance,  if  we  want  to  determine  the  probability  of  scattering  in 
various directions when the scatterer is initially in a given stationary
state and the incident particle has initially a given velocity in a given
direction, we must investigate that stationary state of the whole
system whose picture, according to the above method, contains at
great distances from the point of location of the scatterers only
particles moving with the given initial velocity and direction and
belonging each to a scatterer in the given initial stationary state,
together with particles moving outward from the point of location
of the scatterers and belonging possibly to scatterers in various
stationary states. This picture corresponds closely to the actual state
of affairs in an experimental determination of scattering coefficients,
with the difference that the picture really describes only one actual
system of scatterer plus particle. The distribution of outward moving
particles at infinity in the picture gives us immediately all the
information about scattering coefficients that could be obtained by
experiment. For practical calculations about the stationary state described
by this picture one may use a perturbation method somewhat like
that of § 43, taking as unperturbed system, for example, that for
which there is no interaction between the scatterer and particle.

In dealing with collision problems, a further possibility to be taken
into consideration is that the scatterer may perhaps be capable of
absorbing and re-emitting the particle. This possibility arises when
there exist one or more states of absorption of the whole system, a
state of absorption being an approximately stationary state which
is closed in the sense mentioned at the end of § 38 (i.e. for which
the probability of the particle being at a greater distance than r from
the scatterer tends to zero as r → ∞). Since a state of absorption is
only approximately stationary, its property of being closed will be
only a transient one, and after a sufficient lapse of time there will be
a finite probability of the particle being on its way to infinity.
Physically this means there is a finite probability of spontaneous
emission of the particle. The fact that we had to use the word
'approximately' in stating the conditions required for the phenomena
of emission and absorption to be able to occur shows that these
conditions are not expressible in exact mathematical language. One can give
a meaning to these phenomena only with reference to a perturbation
method. They occur when the unperturbed system (of scatterer plus
particle) has stationary states that are closed. The introduction of the
perturbation spoils the stationary property of these states and gives
rise to spontaneous emission and its converse absorption.

For calculating absorption and emission probabilities it is necessary
to deal with non-stationary states of the system, in contradistinction
to the case for scattering coefficients, so that the perturbation method
of  § 44  must  be  used.  Thus  for  calculating  an  emission  coefficient 
we  must  consider  the  non-stationary  states  of  absorption  described 
above.  Again,  since  an  absorption  is  always  followed  by  a re-emission, 
it cannot be distinguished from a scattering in any experiment
involving a steady state of affairs, corresponding to a stationary state
of  the  system.  The  distinction  can  be  made  only  by  reference  to  a 
non-steady  state  of  affairs,  e.g.  by  use  of  a stream  of  incident  particles 
that  has  a sharp  beginning,  so  that  the  scattered  particles  will  appear 
immediately  after  the  incident  particles  meet  the  scatterers,  while 
those  that  have  been  absorbed  and  re-emitted  will  begin  to  appear 
only  some  time  later.  This  stream  of  particles  would  be  the  picture 
of  a certain  ket  of  infinite  length,  which  could  be  used  for  calculating 
the  absorption  coefficient. 


49.  The  scattering  coefficient 

We  shall  now  consider  the  calculation  of  scattering  coefficients, 
taking  first  the  case  when  there  is  no  absorption  and  emission,  which 
means  that  our  unperturbed  system  has  no  closed  stationary  states. 
We  may  conveniently  take  this  unperturbed  system  to  be  that  for 
which  there  is  no  interaction  between  the  scatterer  and  particle.  Its 
Hamiltonian  will  thus  be  of  the  form 

$$E = H_s + W, \qquad (1)$$

where $H_s$ is that for the scatterer alone and $W$ that for the particle
alone, namely, with neglect of relativistic mechanics,

$$W = \frac{1}{2m}\,(p_x^2 + p_y^2 + p_z^2). \qquad (2)$$

The perturbing energy $V$, assumed small, will now be a function of
the Cartesian coordinates of the particle $x$, $y$, $z$, and also, perhaps,
of its momenta $p_x$, $p_y$, $p_z$, together with dynamical variables
describing the scatterer.

Since we are now interested only in stationary states of the whole
system, we use a perturbation method like that of § 43. Our unperturbed
system now necessarily has a continuous range of energy-levels,
since it contains a free particle, and this gives rise to certain
modifications in the perturbation method. The question of the change
in the energy-levels caused by the perturbation, which was the main
question of § 43, no longer has a meaning, and the convention in § 43
of using the same number of primes to denote nearly equal eigenvalues
of $E$ and $H$ now drops out. Again, the splitting of energy-levels
which we had in § 43 when the unperturbed system is degenerate
cannot now arise, since if the unperturbed system is degenerate the
perturbed one, which must also have a continuous range of energy-levels,
will also be degenerate to exactly the same extent.

We again use the general scheme of equations developed at the
beginning of § 43, equations (1) to (4) there, but we now take our
unperturbed stationary state forming the zero-order approximation
to belong to an energy-level $E'$ just equal to the energy-level $H'$ of
our perturbed stationary state. Thus the $a$'s introduced in the second
of equations (3) § 43 are now all zero and the second of equations
(4) there now reads

$$(E' - E)\,|1\rangle = V|0\rangle. \qquad (3)$$

Similarly, the third of equations (4) § 43 now reads

$$(E' - E)\,|2\rangle = V|1\rangle. \qquad (4)$$

We shall proceed to solve equation (3) and to obtain the scattering
coefficient to the first order. We shall need equation (4) in § 51.

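Let $\alpha$ denote a complete set of commuting observables describing
the scatterer, which are constants of the motion when the scatterer is
alone and may thus be used for labelling the stationary states of the
scatterer. This requires that $H_s$ shall commute with the $\alpha$'s and be
a function of them. We can now take a representation of the whole
system in which the $\alpha$'s and $x$, $y$, $z$, the coordinates of the particle,
are diagonal. This will make $H_s$ diagonal. Let $|0\rangle$ be represented by
$\langle\mathbf{x}\alpha'|0\rangle$ and $|1\rangle$ by $\langle\mathbf{x}\alpha'|1\rangle$, the single variable $\mathbf{x}$ being written to
denote $x$, $y$, $z$ and the prime being omitted from $\mathbf{x}$ for brevity. Also
the single differential $d^3x$ will be written to denote the product $dx\,dy\,dz$.
Equation (3), written in terms of representatives, becomes, with the
help of (1) and (2),

$$\{E' - H_s(\alpha') + \hbar^2/2m \cdot \nabla^2\}\langle\mathbf{x}\alpha'|1\rangle = \sum_{\alpha''} \int \langle\mathbf{x}\alpha'|V|\mathbf{x}''\alpha''\rangle\, d^3x''\, \langle\mathbf{x}''\alpha''|0\rangle. \qquad (5)$$

Suppose that the incident particle has the momentum $\mathbf{p}^0$ and that
the initial stationary state of the scatterer is $\alpha^0$. The stationary state
of our unperturbed system is now the one for which $\mathbf{p} = \mathbf{p}^0$ and
$\alpha = \alpha^0$, and hence its representative is

$$\langle\mathbf{x}\alpha'|0\rangle = \delta_{\alpha'\alpha^0}\, e^{i(\mathbf{p}^0,\mathbf{x})/\hbar}. \qquad (6)$$

This makes equation (5) reduce to

$$\{E' - H_s(\alpha') + \hbar^2/2m \cdot \nabla^2\}\langle\mathbf{x}\alpha'|1\rangle = \int \langle\mathbf{x}\alpha'|V|\mathbf{x}^0\alpha^0\rangle\, d^3x^0\, e^{i(\mathbf{p}^0,\mathbf{x}^0)/\hbar}$$

or

$$(k^2 + \nabla^2)\langle\mathbf{x}\alpha'|1\rangle = F, \qquad (7)$$

where

$$k^2 = 2m\hbar^{-2}\{E' - H_s(\alpha')\} \qquad (8)$$

and

$$F = 2m\hbar^{-2} \int \langle\mathbf{x}\alpha'|V|\mathbf{x}^0\alpha^0\rangle\, d^3x^0\, e^{i(\mathbf{p}^0,\mathbf{x}^0)/\hbar}, \qquad (9)$$

a definite function of $x$, $y$, $z$, and $\alpha'$. We must also have

$$E' = H_s(\alpha^0) + \mathbf{p}^{0\,2}/2m. \qquad (10)$$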

Our problem now is to obtain a solution $\langle\mathbf{x}\alpha'|1\rangle$ of (7) which, for
values of $x$, $y$, $z$ denoting points far from the scatterer, represents
only outward moving particles. The square of its modulus, $|\langle\mathbf{x}\alpha'|1\rangle|^2$,
will then give the density of scattered particles belonging to scatterers
in the state $\alpha'$ when the density of the incident particles is $|\langle\mathbf{x}\alpha^0|0\rangle|^2$,
which is unity. If we transform to polar coordinates $r$, $\theta$, $\phi$, equation
(7) becomes

$$\left\{\frac{1}{r^2}\frac{\partial}{\partial r}\,r^2\frac{\partial}{\partial r} + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\,\sin\theta\,\frac{\partial}{\partial\theta} + \frac{1}{r^2\sin^2\theta}\frac{\partial^2}{\partial\phi^2} + k^2\right\}\langle r\theta\phi\alpha'|1\rangle = F. \qquad (11)$$

Now $F$ must tend to zero as $r \to \infty$, on account of the physical
requirement that the interaction energy between the scatterer and
particle must tend to zero as the distance between them tends to
infinity. If we neglect $F$ in (11) altogether, an approximate solution
for large $r$ is

$$\langle r\theta\phi\alpha'|1\rangle = u(\theta\phi\alpha')\, r^{-1} e^{ikr}, \qquad (12)$$

where $u$ is an arbitrary function of $\theta$, $\phi$, and $\alpha'$, since this expression
substituted in the left-hand side of (11) gives a result of order $r^{-3}$.
When we do not neglect $F$, the solution of (11) will still be of the
form (12) for large $r$, provided $F$ tends to zero sufficiently rapidly as
$r \to \infty$, but the function $u$ will now be definite and determined by the
solution for smaller values of $r$.

For values $\alpha'$ of the $\alpha$'s such that $k^2$, defined by (8), is positive, the
$k$ in (12) must be chosen to be the positive square root of $k^2$, in order
that (12) may represent only outward moving particles, i.e. particles
for which the radial component of momentum, which from § 38
equals $p_r - i\hbar r^{-1}$ or $-i\hbar(\partial/\partial r + r^{-1})$, has a positive value. We now
have that the density of scattered particles belonging to scatterers in
state $\alpha'$, equal to the square of the modulus of (12), falls off with
increasing $r$ according to the inverse square law, as is physically
necessary, and their angular distribution is given by $|u(\theta\phi\alpha')|^2$.
Further, the magnitude, $P'$ say, of the momentum of these scattered
particles must equal $k\hbar$, the momentum being radial for large $r$,
so that their energy is equal to

$$\frac{P'^2}{2m} = \frac{k^2\hbar^2}{2m} = E' - H_s(\alpha') = H_s(\alpha^0) - H_s(\alpha') + \frac{\mathbf{p}^{0\,2}}{2m},$$

with the help of (8) and (10). This is just the energy of the incident
particle, namely $\mathbf{p}^{0\,2}/2m$, reduced by the increase in energy of the
scatterer, namely $H_s(\alpha') - H_s(\alpha^0)$, in agreement with the law of
conservation of energy. For values $\alpha'$ of the $\alpha$'s such that $k^2$ is negative
there are no scattered particles, the total initial energy being
insufficient for the scatterer to be left in the state $\alpha'$.
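The energy balance just stated can be tried out numerically. The following is only a minimal sketch: the function name `scattered_momentum` and its arguments are hypothetical, and the scatterer energies are passed in as plain numbers standing for the values of the scatterer Hamiltonian in the initial and final states.

```python
import math


def scattered_momentum(p0, m, e_initial, e_final):
    """Magnitude P' of the scattered momentum from energy conservation.

    Implements P'^2/2m = p0^2/2m - (e_final - e_initial), where e_initial
    and e_final stand for the scatterer energies in its initial and final
    states. Returns None when the right-hand side (proportional to k^2)
    is negative, i.e. when the initial energy is insufficient for the
    scatterer to be left in the final state.
    """
    kinetic = p0 * p0 / (2.0 * m) - (e_final - e_initial)
    if kinetic < 0.0:
        return None  # k^2 negative: no scattered particles for this final state
    return math.sqrt(2.0 * m * kinetic)


if __name__ == "__main__":
    # Elastic case: the scatterer energy is unchanged, so P' equals p0.
    print(scattered_momentum(3.0, 1.0, 0.0, 0.0))
    # Inelastic case: part of the incident energy is given to the scatterer.
    print(scattered_momentum(3.0, 1.0, 0.0, 2.5))
    # Excitation energy larger than the incident kinetic energy.
    print(scattered_momentum(3.0, 1.0, 0.0, 5.0))
```

Units are arbitrary here; only the consistency of the energy bookkeeping is being illustrated.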

We must now evaluate $u(\theta\phi\alpha')$ for a set of values $\alpha'$ for the $\alpha$'s such
that $k^2$ is positive, and obtain the angular distribution of the scattered
particles belonging to scatterers in state $\alpha'$. It is sufficient to evaluate
$u$ for the direction $\theta = 0$ of the pole of the polar coordinates, since
this direction is arbitrary. We make use of Green's theorem, which
states that for any two functions of position $A$ and $B$ the volume
integral $\int (A\nabla^2 B - B\nabla^2 A)\,d^3x$ taken over any volume equals the
surface integral $\int (A\,\partial B/\partial n - B\,\partial A/\partial n)\,dS$ taken over the boundary
of the volume, $\partial/\partial n$ denoting differentiation along the normal to
the surface. We take

$$A = e^{-ikr\cos\theta}, \qquad B = \langle r\theta\phi\alpha'|1\rangle$$

and apply the theorem to a large sphere with the origin as centre.
The volume integrand is thus

$$e^{-ikr\cos\theta}\,\nabla^2\langle r\theta\phi\alpha'|1\rangle - \langle r\theta\phi\alpha'|1\rangle\,\nabla^2 e^{-ikr\cos\theta} = e^{-ikr\cos\theta}(\nabla^2 + k^2)\langle r\theta\phi\alpha'|1\rangle = e^{-ikr\cos\theta}F$$

from (7) or (11), while the surface integrand is, with the help of (12),

$$e^{-ikr\cos\theta}\frac{\partial}{\partial r}\langle r\theta\phi\alpha'|1\rangle - \langle r\theta\phi\alpha'|1\rangle\frac{\partial}{\partial r}e^{-ikr\cos\theta} = e^{-ikr\cos\theta}\,u\left(-\frac{1}{r^2} + \frac{ik}{r}\right)e^{ikr} + u\,\frac{e^{ikr}}{r}\,ik\cos\theta\,e^{-ikr\cos\theta} = ik\,u\,r^{-1}(1+\cos\theta)\,e^{ikr(1-\cos\theta)}$$

with neglect of $r^{-2}$. Hence we get

$$\int e^{-ikr\cos\theta}F\,d^3x = \int_0^{2\pi} d\phi \int_0^{\pi} r^2\sin\theta\,d\theta \cdot ik\,u\,r^{-1}(1+\cos\theta)\,e^{ikr(1-\cos\theta)},$$

the volume integral on the left being taken over the whole of space.
The right-hand side becomes, on being integrated by parts with
respect to $\theta$,

$$\int_0^{2\pi} d\phi \left\{\Big[u(1+\cos\theta)\,e^{ikr(1-\cos\theta)}\Big]_{\theta=0}^{\theta=\pi} - \int_0^{\pi} e^{ikr(1-\cos\theta)}\,\frac{\partial}{\partial\theta}\big[u(1+\cos\theta)\big]\,d\theta\right\}.$$

The second term in the { } brackets is of the order of magnitude of
$r^{-1}$, as would be revealed by further partial integrations, and may
therefore be neglected. We are thus left with

$$\int e^{-ikr\cos\theta}F\,d^3x = -2\int_0^{2\pi} d\phi\,u(0\phi\alpha') = -4\pi\,u(0\phi\alpha'),$$

giving the value of $u(\theta\phi\alpha')$ for the direction $\theta = 0$.

This result may be written

$$u(0\phi\alpha') = -(4\pi)^{-1}\int e^{-iP'r\cos\theta/\hbar}\,F\,d^3x, \qquad (13)$$

since $P' = k\hbar$. If the vector $\mathbf{p}'$ denotes the momentum of the scattered
electrons coming off in a certain direction (and is thus of magnitude
$P'$), the value of $u$ for this direction will be

$$u(\theta'\phi'\alpha') = -(4\pi)^{-1}\int e^{-i(\mathbf{p}',\mathbf{x})/\hbar}\,F\,d^3x,$$

as follows from (13) if one takes this direction to be the pole of the
polar coordinates. This becomes, with the help of (9),

$$u(\theta'\phi'\alpha') = -(2\pi)^{-1}m\hbar^{-2}\iint e^{-i(\mathbf{p}',\mathbf{x})/\hbar}\,d^3x\,\langle\mathbf{x}\alpha'|V|\mathbf{x}^0\alpha^0\rangle\,d^3x^0\,e^{i(\mathbf{p}^0,\mathbf{x}^0)/\hbar} = -2\pi m h\,\langle\mathbf{p}'\alpha'|V|\mathbf{p}^0\alpha^0\rangle, \qquad (14)$$

when one makes a transformation from the coordinates $\mathbf{x}$ to the
momenta $\mathbf{p}$ of the particle, using the transformation function (54)
of § 23. The single letter $\mathbf{p}$ is here used as a label for the three
components of momentum.

The density of scattered particles belonging to scatterers in state
$\alpha'$ is now given by $|u(\theta'\phi'\alpha')|^2/r^2$. Since their velocity is $P'/m$, the
rate at which these particles appear per unit solid angle about the
direction of the vector $\mathbf{p}'$ will be $P'/m \cdot |u(\theta'\phi'\alpha')|^2$. The density of
the incident particles is, as we have seen, unity, so that the number
of incident particles crossing unit area per unit time is equal to their
velocity $P^0/m$, where $P^0$ is the magnitude of $\mathbf{p}^0$. Hence the effective
area that must be hit by an incident particle in order to be scattered
in a unit solid angle about the direction $\mathbf{p}'$ and then belong to a
scatterer in state $\alpha'$ will be

$$\frac{P'}{P^0}\,|u(\theta'\phi'\alpha')|^2 = 4\pi^2 m^2 h^2\,\frac{P'}{P^0}\,|\langle\mathbf{p}'\alpha'|V|\mathbf{p}^0\alpha^0\rangle|^2. \qquad (15)$$

This is the scattering coefficient for transitions $\alpha^0 \to \alpha'$ of the scatterer.
It depends on that matrix element $\langle\mathbf{p}'\alpha'|V|\mathbf{p}^0\alpha^0\rangle$ of the perturbing
energy $V$ whose column $\mathbf{p}^0\alpha^0$ and whose row $\mathbf{p}'\alpha'$ refer respectively to
the initial and final states of the unperturbed system, between which
the scattering transition process takes place. The result (15) is thus
in some ways analogous to the result (24) of § 44, although the
numerical coefficients are different in the two cases, corresponding
to the different natures of the two transition processes.
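As an illustration of how (15) is used, one may take the simplest special case, which is an assumption of this sketch and not treated in the text: a scatterer with no internal variables, represented by a static screened Coulomb (Yukawa) potential V(r) = g exp(-mu r)/r. Its momentum matrix element follows from the transformation function h^(-3/2) exp(i(p,x)/hbar) together with the standard Fourier integral of that potential, and (15) should then agree with the familiar first Born cross-section (2 m g / hbar^2)^2 / (q^2 + mu^2)^2. All function names and the choice hbar = 1 are hypothetical.

```python
import math

HBAR = 1.0                    # assumption: units in which hbar = 1
H = 2.0 * math.pi * HBAR      # Planck's constant h = 2*pi*hbar


def _wave_vector_transfer_sq(p_final, p_initial):
    """q^2 with q = (p' - p0)/hbar."""
    return sum((a - b) ** 2 for a, b in zip(p_final, p_initial)) / HBAR ** 2


def yukawa_matrix_element(p_final, p_initial, g, mu):
    """<p'|V|p0> for V(r) = g*exp(-mu*r)/r (illustrative potential).

    Uses <x|p> = h^(-3/2) exp(i(p,x)/hbar) and the standard result
    int V(x) exp(-i q.x) d^3x = 4*pi*g/(q^2 + mu^2).
    """
    q2 = _wave_vector_transfer_sq(p_final, p_initial)
    return 4.0 * math.pi * g / (q2 + mu * mu) / H ** 3


def scattering_coefficient(p_final, p_initial, m, g, mu):
    """Formula (15): 4 pi^2 m^2 h^2 (P'/P0) |<p'|V|p0>|^2."""
    big_p_final = math.sqrt(sum(a * a for a in p_final))
    big_p_initial = math.sqrt(sum(a * a for a in p_initial))
    element = yukawa_matrix_element(p_final, p_initial, g, mu)
    return (4.0 * math.pi ** 2 * m ** 2 * H ** 2
            * (big_p_final / big_p_initial) * abs(element) ** 2)


def born_cross_section(p_final, p_initial, m, g, mu):
    """Familiar first Born result for the same potential, for comparison."""
    q2 = _wave_vector_transfer_sq(p_final, p_initial)
    return (2.0 * m * g / HBAR ** 2) ** 2 / (q2 + mu * mu) ** 2


if __name__ == "__main__":
    p0, p1 = (0.0, 0.0, 1.0), (0.0, 1.0, 0.0)   # elastic scattering through 90 degrees
    print(scattering_coefficient(p1, p0, m=1.0, g=0.3, mu=0.5))
    print(born_cross_section(p1, p0, m=1.0, g=0.3, mu=0.5))
```

The two printed numbers should coincide up to rounding, which is a check on the factors of h and hbar in (14) and (15).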

50. Solution with the momentum representation

The result (15) for the scattering coefficient makes a reference only
to that representation in which the momentum $\mathbf{p}$ is diagonal. One
would thus expect to be able to get a more direct proof of the result
by working all the time in the p-representation, instead of working
in the x-representation and transforming at the end to the
p-representation, as was done in § 49. This would not at first sight appear
to be a great improvement, as the lack of directness of the
x-representation method is offset by more direct applicability, it being
possible to picture the square of the modulus of the x-representative
of a state as the density of a stream of particles in process of being
scattered. The x-representation method has, however, other more
serious disadvantages. One of the main applications of the theory
of collisions is to the case of photons as incident particles. Now a
photon is not a simple particle but has a polarization. It is evident
from classical electromagnetic theory that a photon with a definite
momentum, i.e. one moving in a definite direction with a definite
frequency, may have a definite state of polarization (linear, circular,
etc.), while a photon with a definite position, which is to be pictured
as an electromagnetic disturbance confined to a very small volume,
cannot have any definite polarization. These facts mean that the
polarization observable of a photon commutes with its momentum,
but not with its position. This results in the p-representation method
being immediately applicable to the case of photons, it being only
necessary to introduce the polarizing variable into the representatives
and treat it along with the $\alpha$'s describing the scatterer, while the
x-representation method is not applicable. Further, in dealing with
photons, it is necessary to take relativistic mechanics into account.
This can easily be done in the p-representation method, but not so
easily in the x-representation method.

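Equation (3) still holds with relativistic mechanics, but $W$ is now
given by

$$W^2/c^2 = m^2c^2 + P^2 = m^2c^2 + p_x^2 + p_y^2 + p_z^2 \qquad (16)$$

instead of by (2). Written in terms of p-representatives, equation (3)
gives

$$\{E' - H_s(\alpha') - W\}\langle\mathbf{p}\alpha'|1\rangle = \langle\mathbf{p}\alpha'|V|0\rangle,$$

$\mathbf{p}$ being written instead of $\mathbf{p}'$ for brevity and $W$ being understood as
a definite function of $p_x$, $p_y$, $p_z$ given by (16). This may be written

$$(W' - W)\langle\mathbf{p}\alpha'|1\rangle = \langle\mathbf{p}\alpha'|V|0\rangle, \qquad (17)$$

where

$$W' = E' - H_s(\alpha') \qquad (18)$$

and is the energy required by the law of conservation of energy for
a scattered particle belonging to a scatterer in state $\alpha'$. The ket $|0\rangle$
is represented by (6) in the x-representation and the basic ket $|\mathbf{p}^0\alpha^0\rangle$
is represented by

$$\langle\mathbf{x}\alpha'|\mathbf{p}^0\alpha^0\rangle = \delta_{\alpha'\alpha^0}\langle\mathbf{x}|\mathbf{p}^0\rangle = \delta_{\alpha'\alpha^0}\,h^{-3/2}e^{i(\mathbf{p}^0,\mathbf{x})/\hbar},$$

from the transformation function (54) of § 23. Hence

$$|0\rangle = h^{3/2}\,|\mathbf{p}^0\alpha^0\rangle, \qquad (19)$$

and equation (17) may be written

$$(W' - W)\langle\mathbf{p}\alpha'|1\rangle = h^{3/2}\langle\mathbf{p}\alpha'|V|\mathbf{p}^0\alpha^0\rangle. \qquad (20)$$

We now make a transformation from the Cartesian coordinates
$p_x$, $p_y$, $p_z$ of $\mathbf{p}$ to its polar coordinates $P$, $\omega$, $\chi$, given by

$$p_x = P\cos\omega, \qquad p_y = P\sin\omega\cos\chi, \qquad p_z = P\sin\omega\sin\chi.$$

If in the new representation we take the weight function $P^2\sin\omega$,
then the weight attached to any volume of p-space will be the same
as in the previous p-representation, so that the transformation will
mean simply a relabelling of the rows and columns of the matrices
without any alteration of the matrix elements. Thus (20) will become
in the new representation

$$(W' - W)\langle P\omega\chi\alpha'|1\rangle = h^{3/2}\langle P\omega\chi\alpha'|V|P^0\omega^0\chi^0\alpha^0\rangle, \qquad (21)$$

$W$ being now a function of the single variable $P$.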


The coefficient of $\langle P\omega\chi\alpha'|1\rangle$, namely $W' - W$, is now simply a
multiplying factor and not a differential operator as it was with the
x-representation method. We can therefore divide out by this factor
and obtain an explicit expression for $\langle P\omega\chi\alpha'|1\rangle$. When, however, $\alpha'$
is such that $W'$, defined by (18), is greater than $mc^2$, this factor will
have the value zero for a certain point in the domain of the variable
$P$, namely the point $P = P'$, given in terms of $W'$ by (16). The
function $\langle P\omega\chi\alpha'|1\rangle$ will then have a singularity at this point. This
singularity shows that $\langle P\omega\chi\alpha'|1\rangle$ represents an infinite number of
particles moving about at great distances from the scatterers with
energies indefinitely close to $W'$ and it is therefore this singularity
that we have to study to get the angular distribution of the particles
at infinity.

The result of dividing out (21) by the factor $W' - W$ is, according
to (13) of § 15,

$$\langle P\omega\chi\alpha'|1\rangle = h^{3/2}\langle P\omega\chi\alpha'|V|P^0\omega^0\chi^0\alpha^0\rangle/(W' - W) + \lambda(\omega\chi\alpha')\,\delta(W' - W), \qquad (22)$$

where $\lambda$ is an arbitrary function of $\omega$, $\chi$, and $\alpha'$. To give a meaning
to the first term on the right-hand side of (22), we make the
convention that its integral with respect to $P$ over a range that includes the
value $P'$ is the limit when $\epsilon \to 0$ of the integral when the small
domain $P' - \epsilon$ to $P' + \epsilon$ is excluded from the range of integration.
This is sufficient to make the meaning of (22) precise, since we are
interested effectively only in the integrals of the representatives of
states when the representation has continuous ranges of rows and
columns. We see that equation (21) is inadequate to determine the
representative $\langle P\omega\chi\alpha'|1\rangle$ completely, on account of the arbitrary
function $\lambda$ occurring in (22). We must choose this $\lambda$ such that
$\langle P\omega\chi\alpha'|1\rangle$ represents only outward moving particles, since we want
the only inward moving particles to be those corresponding to $|0\rangle$.
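Let us take first the general case when the representative $\langle P\omega\chi|\rangle$
of a state of the particle satisfies an equation of the type

$$(W' - W)\langle P\omega\chi|\rangle = f(P\omega\chi), \qquad (23)$$

where $f(P\omega\chi)$ is any function of $P$, $\omega$, and $\chi$, and $W'$ is a number
greater than $mc^2$, so that $\langle P\omega\chi|\rangle$ is of the form

$$\langle P\omega\chi|\rangle = f(P\omega\chi)/(W' - W) + \lambda(\omega\chi)\,\delta(W' - W), \qquad (24)$$

and let us determine now what $\lambda$ must be in order that $\langle P\omega\chi|\rangle$ may
represent only outward moving particles. We can do this by
transforming $\langle P\omega\chi|\rangle$ to the x-representation, or rather the
$(r\theta\phi)$-representation, and comparing it with (12) for large values of $r$. The
transformation function is

$$\langle r\theta\phi|P\omega\chi\rangle = h^{-3/2}e^{i(\mathbf{p},\mathbf{x})/\hbar} = h^{-3/2}e^{iPr[\cos\omega\cos\theta + \sin\omega\sin\theta\cos(\chi-\phi)]/\hbar}.$$

For the direction $\theta = 0$ we find

$$\langle r0\phi|\rangle = h^{-3/2}\int_0^\infty P^2\,dP\int_0^{2\pi} d\chi\int_0^\pi \sin\omega\,d\omega\;e^{iPr\cos\omega/\hbar}\langle P\omega\chi|\rangle$$

$$= h^{-3/2}\int_0^\infty P^2\,dP\int_0^{2\pi} d\chi\left\{\left[-\frac{e^{iPr\cos\omega/\hbar}}{iPr/\hbar}\langle P\omega\chi|\rangle\right]_{\omega=0}^{\omega=\pi} + \int_0^\pi \frac{e^{iPr\cos\omega/\hbar}}{iPr/\hbar}\,\frac{\partial}{\partial\omega}\langle P\omega\chi|\rangle\,d\omega\right\}.$$

The second term in the { } brackets is of order $r^{-2}$, as may be verified
by further partial integrations with respect to $\omega$, and can therefore
be neglected. We are left with

$$\langle r0\phi|\rangle = i\hbar\,h^{-3/2}r^{-1}\int_0^\infty P\,dP\int_0^{2\pi} d\chi\,\{e^{-iPr/\hbar}\langle P\pi\chi|\rangle - e^{iPr/\hbar}\langle P0\chi|\rangle\}$$

$$= i\,h^{-1/2}r^{-1}\int_0^\infty P\,dP\,\{e^{-iPr/\hbar}\langle P\pi\chi|\rangle - e^{iPr/\hbar}\langle P0\chi|\rangle\}. \qquad (25)$$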

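When we substitute for $\langle P\omega\chi|\rangle$ its value given by (24), the first
term in the integrand in (25) gives

$$i\,h^{-1/2}r^{-1}\int_0^\infty P\,dP\;e^{-iPr/\hbar}\{f(P\pi\chi)/(W' - W) + \lambda(\pi\chi)\,\delta(W' - W)\}. \qquad (26)$$

The term involving $\delta(W' - W)$ here may be integrated immediately
and gives, when one uses the relation $P\,dP = W\,dW/c^2$, which
follows from (16),

$$i\,h^{-1/2}c^{-2}r^{-1}\int_{mc^2}^\infty W\,dW\;e^{-iPr/\hbar}\lambda(\pi\chi)\,\delta(W' - W) = i\,h^{-1/2}c^{-2}r^{-1}\,W'\lambda(\pi\chi)\,e^{-iP'r/\hbar}. \qquad (27)$$

To integrate the other term in (26) we use the formula

$$\int_0^\infty g(P)\,\frac{e^{-iPr/\hbar}}{P' - P}\,dP = g(P')\int_0^\infty \frac{e^{-iPr/\hbar}}{P' - P}\,dP, \qquad (28)$$

with neglect of terms involving $r^{-1}$, for any continuous function $g(P)$,
which formula holds since $\int_0^\infty K(P)\,e^{-iPr/\hbar}\,dP$ is of order $r^{-1}$ for any
continuous function $K(P)$ and since the difference

$$g(P)/(P' - P) - g(P')/(P' - P)$$

is continuous. The right-hand side of (28) when evaluated with
neglect of terms involving $r^{-1}$, and also with neglect of the small
domain $P' - \epsilon$ to $P' + \epsilon$ in the domain of integration, gives

$$g(P')\int_{-\infty}^{\infty}\frac{e^{-iPr/\hbar}}{P' - P}\,dP = g(P')\,e^{-iP'r/\hbar}\int_{-\infty}^{\infty}\frac{e^{i(P'-P)r/\hbar}}{P' - P}\,dP = i\,g(P')\,e^{-iP'r/\hbar}\int_{-\infty}^{\infty}\frac{\sin\{(P'-P)r/\hbar\}}{P' - P}\,dP = i\pi\,g(P')\,e^{-iP'r/\hbar}. \qquad (29)$$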
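The value i pi of the principal-value integral used in (29) can be checked numerically. The sketch below is only an illustration, with a hypothetical function name: it writes q for P' - P and a for r/hbar, and applies a midpoint rule on a grid symmetric about q = 0, so that the cosine part cancels in pairs (which plays the part of excluding a small symmetric domain around the singular point) while the sine part tends to pi as the range grows.

```python
import cmath
import math


def principal_value_integral(a, half_width, steps):
    """Midpoint estimate of PV int_{-L}^{L} exp(i*a*q)/q dq, L = half_width.

    The grid is symmetric about q = 0 and, for an even number of steps,
    never contains the point q = 0 itself.
    """
    if steps % 2:
        raise ValueError("steps must be even so that no grid point falls on q = 0")
    h = 2.0 * half_width / steps
    total = 0j
    for j in range(steps):
        q = -half_width + (j + 0.5) * h
        total += cmath.exp(1j * a * q) / q
    return total * h


if __name__ == "__main__":
    value = principal_value_integral(1.0, 2000.0, 400000)
    print(value)            # close to i*pi
    print(1j * math.pi)
```

The residual difference from i pi comes mainly from the finite range of integration and falls off roughly as the inverse of a times the half width.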


In our present example $g(P)$ is

$$g(P) = i\,h^{-1/2}r^{-1}P\,f(P\pi\chi)\,(P' - P)/(W' - W),$$

which has the limiting value when $P = P'$,

$$g(P') = i\,h^{-1/2}r^{-1}P'f(P'\pi\chi)\,W'/P'c^2 = i\,h^{-1/2}c^{-2}r^{-1}\,W'f(P'\pi\chi).$$

Substituting this in (29) and adding on the expression (27), we obtain
the following value for the integral (26)

$$h^{-1/2}c^{-2}r^{-1}\,W'\{-\pi f(P'\pi\chi) + i\lambda(\pi\chi)\}\,e^{-iP'r/\hbar}. \qquad (30)$$

Similarly the second term in the integrand in (25) gives

$$h^{-1/2}c^{-2}r^{-1}\,W'\{-\pi f(P'0\chi) - i\lambda(0\chi)\}\,e^{iP'r/\hbar}. \qquad (31)$$

The sum of these two expressions is the value of $\langle r0\phi|\rangle$ when $r$ is
large. We require that $\langle r0\phi|\rangle$ shall represent only outward moving
particles, and hence it must be of the form of a multiple of $e^{iP'r/\hbar}$.
Thus (30) must vanish, so that

$$\lambda(\pi\chi) = -i\pi f(P'\pi\chi). \qquad (32)$$

We see in this way that the condition that $\langle r0\phi|\rangle$ shall represent
only outward moving particles in the direction $\theta = 0$ fixes the value
of $\lambda$ for the opposite direction $\omega = \pi$. Since the direction $\theta = 0$ or
$\omega = 0$ of the pole of our polar coordinates is not in any way singular,
we can generalize (32) to

$$\lambda(\omega\chi) = -i\pi f(P'\omega\chi), \qquad (33)$$

which gives the value of $\lambda$ for an arbitrary direction. This value
substituted in (24) gives a result that may be written

$$\langle P\omega\chi|\rangle = f(P\omega\chi)\{1/(W' - W) - i\pi\,\delta(W' - W)\}, \qquad (34)$$

since one can substitute $P'$ for $P$ in the coefficient of a term involving
$\delta(W' - W)$ as a factor without changing the value of the term. The
condition that $\langle P\omega\chi|\rangle$ shall represent only outward moving particles is
thus that it shall contain the factor

$$\{1/(W' - W) - i\pi\,\delta(W' - W)\}. \qquad (35)$$

It is interesting to note that this factor is of the form of the
right-hand side of equation (15) of § 15.
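A way of seeing the factor (35) at work numerically, which is not used in the text and is offered here only as a hedged illustration, is through the standard limit 1/(x + i eps) -> 1/x - i pi delta(x) as eps -> 0, the 1/x being understood in the principal-value sense adopted for (22). The sketch integrates a smooth test function against 1/(W' - W + i eps) and compares the imaginary part with -pi times the function at W'. The function name, the Gaussian test function and the numerical parameters are all hypothetical.

```python
import math


def smeared_integral(g, w_prime, eps, half_width, steps):
    """Midpoint estimate of int g(W) / (W' - W + i*eps) dW over W' +/- half_width."""
    h = 2.0 * half_width / steps
    total = 0j
    for j in range(steps):
        w = w_prime - half_width + (j + 0.5) * h
        total += g(w) / (w_prime - w + 1j * eps)
    return total * h


def gaussian(w):
    """Smooth test function standing in for the coefficient f (illustrative only)."""
    return math.exp(-(w - 1.0) ** 2)


if __name__ == "__main__":
    w_prime = 1.3
    value = smeared_integral(gaussian, w_prime, eps=1e-3, half_width=8.0, steps=320000)
    print(value.imag)                       # approximately -pi * gaussian(W')
    print(-math.pi * gaussian(w_prime))
```

The step length must be kept well below eps for the narrow peak to be resolved; the agreement improves as eps is reduced with the grid refined accordingly.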

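With $\lambda$ given by (33), expression (30) vanishes and the value of
$\langle r0\phi|\rangle$ for large $r$ is given by expression (31) alone, thus

$$\langle r0\phi|\rangle = -2\pi\,h^{-1/2}c^{-2}r^{-1}\,W'f(P'0\chi)\,e^{iP'r/\hbar}.$$

This may be generalized to

$$\langle r\theta\phi|\rangle = -2\pi\,h^{-1/2}c^{-2}r^{-1}\,W'f(P'\omega\chi)\,e^{iP'r/\hbar},$$

giving the value of $\langle r\theta\phi|\rangle$ for any direction $\theta$, $\phi$ in terms of $f(P'\omega\chi)$
for the same direction labelled by $\omega$, $\chi$. This is of the form (12) with

$$u(\theta\phi) = -2\pi\,h^{-1/2}c^{-2}\,W'f(P'\omega\chi)$$

and thus represents a distribution of outward moving particles of
momentum $P'$ whose number is

$$\frac{c^2P'}{W'}\,|u(\theta\phi)|^2 = \frac{4\pi^2\,W'P'}{hc^2}\,|f(P'\omega\chi)|^2 \qquad (36)$$

per unit solid angle per unit time. This distribution is the one
represented by the $\langle P\omega\chi|\rangle$ of (34).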

From this general result we can infer that, whenever we have a
representative $\langle P\omega\chi|\rangle$ representing only outward moving particles
and satisfying an equation of the type (23), the number per unit solid
angle per unit time of these particles is given by (36). If this $\langle P\omega\chi|\rangle$
occurs in a problem in which the number of incident particles is one
per unit volume, it will correspond to a scattering coefficient of
amount

$$\frac{4\pi^2\,W^0W'P'}{hc^4P^0}\,|f(P'\omega\chi)|^2. \qquad (37)$$

It is only the value of the function $f(P\omega\chi)$ for the point $P = P'$ that
is of importance.

If we now apply this general theory to our equations (21) and
(22), we have

$$f(P\omega\chi) = h^{3/2}\langle P\omega\chi\alpha'|V|P^0\omega^0\chi^0\alpha^0\rangle.$$

Hence from (37) the scattering coefficient is

$$4\pi^2h^2\,W^0W'P'/c^4P^0 \cdot |\langle P'\omega\chi\alpha'|V|P^0\omega^0\chi^0\alpha^0\rangle|^2. \qquad (38)$$

If one neglects relativity and puts $W^0W'/c^4 = m^2$, this result reduces
to the result (15) obtained in the preceding section by means of
Green's theorem.
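The reduction of (38) to (15) can be made concrete by comparing the kinematic factors that multiply the squared matrix element in the two formulas. The sketch below is illustrative only: the function names are hypothetical, the energy is taken from (16) as W = c (m^2 c^2 + P^2)^(1/2), and units with m = c = h = 1 are assumed in the example.

```python
import math


def relativistic_energy(big_p, m, c):
    """W from W^2/c^2 = m^2 c^2 + P^2."""
    return c * math.sqrt(m * m * c * c + big_p * big_p)


def factor_relativistic(p_initial, p_final, m, c, h):
    """Coefficient of the squared matrix element in (38)."""
    w0 = relativistic_energy(p_initial, m, c)
    w1 = relativistic_energy(p_final, m, c)
    return 4.0 * math.pi ** 2 * h ** 2 * w0 * w1 * p_final / (c ** 4 * p_initial)


def factor_nonrelativistic(p_initial, p_final, m, h):
    """Coefficient of the squared matrix element in (15)."""
    return 4.0 * math.pi ** 2 * m ** 2 * h ** 2 * p_final / p_initial


if __name__ == "__main__":
    for big_p in (1.0, 0.1, 0.001):        # momenta in units of m*c
        ratio = (factor_relativistic(big_p, big_p, 1.0, 1.0, 1.0)
                 / factor_nonrelativistic(big_p, big_p, 1.0, 1.0))
        print(big_p, ratio)                # ratio equals 1 + P^2/(m c)^2 here
```

For equal initial and final momenta the ratio is exactly 1 + P^2/(mc)^2, so the two formulas agree to the extent that P is small compared with mc.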

51.  Dispersive  scattering 

We shall now determine the scattering when the incident particle
is capable of being absorbed, that is, when our unperturbed system
of scatterer plus particle has closed stationary states with the particle
absorbed. The existence of these closed states for the unperturbed
system will be found to have a considerable effect on the scattering
for the perturbed system, and indeed an effect that depends very
much on the energy of the incident particle, giving rise to the
phenomenon of dispersion in optics when the incident particle is taken to
be a photon.

We use a representation for which the basic kets correspond to
the stationary states of the unperturbed system, as was the case with
the p-representation of the preceding section. We take these stationary
states to be the states $(\mathbf{p}'\alpha')$ for which the particle has a definite
momentum $\mathbf{p}'$ and the scatterer is in a definite state $\alpha'$, together with
the closed states, $k$ say, which form a separate discrete set, and
assume that these states are all independent and orthogonal. This
assumption is not accurate when the particle is an electron or atomic
nucleus, since in this case for an absorbed state $k$ the particle will
still certainly be somewhere, so that one would expect to be able to
expand $|k\rangle$ in terms of the eigenkets $|\mathbf{x}'\alpha'\rangle$ of $x$, $y$, $z$, and the $\alpha$'s,
and hence also in terms of the $|\mathbf{p}'\alpha'\rangle$'s. On the other hand, when the
particle is a photon it will no longer exist for the absorbed states,
which are then certainly independent of and orthogonal to the states
$(\mathbf{p}'\alpha')$ for which the particle does exist. Thus the assumption is valid
in this case, which is an important practical one.

Since  we  are  concerned  with  scattering,  we  must  still  deal  with 
stationary  states  of  the  whole  system.  We  shall  now,  however,  have 
to  work  to  the  second  order  of  accuracy,  so  that  we  cannot  use  merely 
the first-order equation (3), but must use also (4). Equation (3)
becomes, when written in terms of representatives in our present
representation,

$$\left.\begin{aligned}
(W'-W)\langle p\alpha'|1\rangle &= \langle p\alpha'|V|0\rangle,\\
(E'-E_k)\langle k|1\rangle &= \langle k|V|0\rangle,
\end{aligned}\right\} \qquad (39)$$

where $W'$ is the function of $E'$ and $\alpha'$ given by (18) and $E_k$ is the
energy of the stationary state k of the unperturbed system. Similarly,
equation (4) becomes

$$\left.\begin{aligned}
(W'-W)\langle p\alpha'|2\rangle &= \langle p\alpha'|V|1\rangle,\\
(E'-E_k)\langle k|2\rangle &= \langle k|V|1\rangle.
\end{aligned}\right\} \qquad (40)$$

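Expanding the right-hand sides by matrix multiplication, we get

$$\left.\begin{aligned}
(W'-W)\langle p\alpha'|2\rangle &= \sum_{\alpha''}\int \langle p\alpha'|V|p''\alpha''\rangle\,d^3p''\,\langle p''\alpha''|1\rangle + \sum_{k''}\langle p\alpha'|V|k''\rangle\langle k''|1\rangle,\\
(E'-E_k)\langle k|2\rangle &= \sum_{\alpha''}\int \langle k|V|p''\alpha''\rangle\,d^3p''\,\langle p''\alpha''|1\rangle + \sum_{k''}\langle k|V|k''\rangle\langle k''|1\rangle.
\end{aligned}\right\} \qquad (41)$$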

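The ket $|0\rangle$ is still given by (19), so (39) may be written

$$(W'-W)\langle p\alpha'|1\rangle = h^{3/2}\langle p\alpha'|V|p^0\alpha^0\rangle, \qquad (42)$$

$$(E'-E_k)\langle k|1\rangle = h^{3/2}\langle k|V|p^0\alpha^0\rangle. \qquad (43)$$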

We may assume that the matrix elements $\langle k'|V|k''\rangle$ of V vanish,
since these matrix elements are not essential to the phenomena under
investigation, and if they did not vanish it would mean simply that
the absorbed states k had not been suitably chosen. We shall further
assume that the matrix elements $\langle p'\alpha'|V|p''\alpha''\rangle$ are of the second order
of smallness when the matrix elements $\langle k'|V|p''\alpha''\rangle$, $\langle p'\alpha'|V|k''\rangle$ are
taken to be of the first order of smallness. This assumption will be
justified for the case of photons in § 64. We now have from (43) and
(42) that $\langle k|1\rangle$ is of the first order of smallness, provided $E'$ does not
lie near one of the discrete set of energy-levels $E_k$, and $\langle p\alpha'|1\rangle$ is of
the second order. The value of $\langle p\alpha'|2\rangle$ to the second order will thus
be given, from the first of equations (41), by

$$(W'-W)\langle p\alpha'|2\rangle = h^{3/2}\sum_{k''}\langle p\alpha'|V|k''\rangle\langle k''|V|p^0\alpha^0\rangle/(E'-E_{k''}).$$

The total correction in the wave function to the second order, namely
$\langle p\alpha'|1\rangle$ plus $\langle p\alpha'|2\rangle$, therefore satisfies

$$(W'-W)\{\langle p\alpha'|1\rangle + \langle p\alpha'|2\rangle\} = h^{3/2}\Bigl\{\langle p\alpha'|V|p^0\alpha^0\rangle + \sum_k \langle p\alpha'|V|k\rangle\langle k|V|p^0\alpha^0\rangle/(E'-E_k)\Bigr\}.$$

This equation is of the type (23), provided $\alpha'$ is such that $W' > mc^2$,
which means that $\alpha'$ as a final state for the scatterer is not inconsistent
with the law of conservation of energy. We can therefore infer
from the general result (37) that the scattering coefficient is

$$\frac{4\pi^2h^2W^0W'P'}{c^4P^0}\left|\langle p'\alpha'|V|p^0\alpha^0\rangle + \sum_k \frac{\langle p'\alpha'|V|k\rangle\langle k|V|p^0\alpha^0\rangle}{E'-E_k}\right|^2. \qquad (44)$$

The  scattering  may  now  be  considered  composed  of  two  parts, 
a part that arises from the matrix element $\langle p'\alpha'|V|p^0\alpha^0\rangle$ of the
perturbing energy and a part that arises from the matrix elements
$\langle p'\alpha'|V|k\rangle$ and $\langle k|V|p^0\alpha^0\rangle$. The first part, which is the same as our
previously  obtained  result  (38),  may  be  called  the  direct  scattering. 
The  second  part  may  be  considered  as  arising  from  an  absorption  of 
the  incident  particle  into  some  state  k,  followed  immediately  by  a 
re-emission  in  a different  direction,  and  is  like  the  transitions  through 
an  intermediate  state  considered  in  § 44.  The  fact  that  we  have  to 
add  the  two  terms  before  taking  the  square  of  the  modulus  denotes 
interference  between  the  two  kinds  of  scattering.  There  is  no  experi- 
mental way  of  separating  the  two  kinds,  the  distinction  between 
them  being  only  mathematical. 
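
The interference described above can be illustrated numerically. The following is only a sketch: the matrix elements and energy levels are made-up numbers, and the function name `dispersive_amplitude` is hypothetical. It shows that the squared modulus of the sum in (44) differs from the sum of the squared moduli by a cross term.

```python
# Illustrative sketch of the structure of formula (44): the direct and
# dispersive amplitudes are added BEFORE the squared modulus is taken.
# Every numerical value below is hypothetical.

def dispersive_amplitude(E, levels):
    """Sum over closed states k of <p'a'|V|k><k|V|p0a0> / (E - E_k).

    `levels` is a list of (E_k, outgoing_element, incoming_element) tuples.
    """
    return sum(v_out * v_in / (E - E_k) for E_k, v_out, v_in in levels)

direct = 0.20 + 0.05j                       # stands in for <p'a'|V|p0a0>
levels = [(1.0, 0.30 + 0.10j, 0.30 - 0.10j),
          (2.5, 0.10 + 0.00j, 0.20 + 0.05j)]
E = 1.4                                     # total energy E', not close to any E_k

dispersive = dispersive_amplitude(E, levels)
coherent = abs(direct + dispersive) ** 2               # what (44) prescribes
incoherent = abs(direct) ** 2 + abs(dispersive) ** 2   # the two kinds taken separately
interference = 2 * (direct.conjugate() * dispersive).real

print(coherent, incoherent, interference)
```

The difference between `coherent` and `incoherent` is exactly the cross term `interference`, which is why the two kinds of scattering cannot be separated experimentally.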


52.  Resonance  scattering 

Suppose the energy of the incident particle to be varied continuously
while the initial state $\alpha^0$ of the scatterer is kept fixed, so
that the total energy $E'$ or $H'$ varies continuously. The formula (44)
now  shows  that  as  E'  approaches  one  of  the  discrete  set  of  energy- 
levels  Ek,  the  scattering  becomes  very  large.  In  fact,  according  to 
formula  (44)  the  scattering  should  be  infinite  when  E'  is  exactly  equal 
to  an  Ek.  An  infinite  scattering  coefficient  is,  of  course,  physically 
impossible,  so  that  we  can  infer  that  the  approximations  used  in 
deriving  (44)  are  no  longer  legitimate  when  E'  is  close  to  an  Ek.  To 
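investigate the scattering in this case we must therefore go back to
the exact equation

$$(E'-E)|H'\rangle = V|H'\rangle,$$

equation (2) of § 43 with $E'$ written for $H'$, and use a different method
of approximating to its solution. This exact equation, written in
terms of representatives like (41), becomes

$$\left.\begin{aligned}
(W'-W)\langle p\alpha'|H'\rangle &= \sum_{\alpha''}\int \langle p\alpha'|V|p''\alpha''\rangle\,d^3p''\,\langle p''\alpha''|H'\rangle + \sum_{k''}\langle p\alpha'|V|k''\rangle\langle k''|H'\rangle,\\
(E'-E_k)\langle k|H'\rangle &= \sum_{\alpha''}\int \langle k|V|p''\alpha''\rangle\,d^3p''\,\langle p''\alpha''|H'\rangle + \sum_{k''}\langle k|V|k''\rangle\langle k''|H'\rangle.
\end{aligned}\right\} \qquad (45)$$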


Let  us  take  one  particular  Ek  and  consider  the  case  when  E'  is  close 
to  it.  The  large  term  in  the  scattering  coefficient  (44)  now  arises  from 
those elements of the matrix representing V that lie in row k or in
column k, i.e. those of the type $\langle k|V|p\alpha'\rangle$ or $\langle p\alpha'|V|k\rangle$. The scattering
arising from the other matrix elements of V is of a smaller order
of magnitude. This suggests that in our exact equations (45) we should
make the approximation of neglecting all the matrix elements of V
except the important ones, which are those of the type $\langle p\alpha'|V|k\rangle$ or
$\langle k|V|p\alpha'\rangle$, where $\alpha'$ is a state of the scatterer that has not too much
energy to be disallowed as a final state by the law of conservation of
energy. These equations then reduce to

$$\left.\begin{aligned}
(W'-W)\langle p\alpha'|H'\rangle &= \langle p\alpha'|V|k\rangle\langle k|H'\rangle,\\
(E'-E_k)\langle k|H'\rangle &= \sum_{\alpha'}\int \langle k|V|p\alpha'\rangle\,d^3p\,\langle p\alpha'|H'\rangle,
\end{aligned}\right\} \qquad (46)$$

the $\alpha'$ summation being over those values of $\alpha'$ for which $W'$ given
by (18) is $> mc^2$. These equations are now sufficiently simple for us
to be able to solve exactly without further approximation.

From the first of equations (46) we obtain by division

$$\langle p\alpha'|H'\rangle = \langle p\alpha'|V|k\rangle\langle k|H'\rangle/(W'-W) + \lambda\,\delta(W'-W). \qquad (47)$$

We must choose $\lambda$, which may be any function of the momentum
p and $\alpha'$, such that (47) represents the incident particles corresponding
to $|0\rangle$ or $h^{3/2}|p^0\alpha^0\rangle$ together with only outward moving particles. [The
representative of $h^{3/2}|p^0\alpha^0\rangle$ is actually of the form $\lambda\,\delta(W'-W)$, since
the conditions $\alpha' = \alpha^0$ and $p = p^0$ for it not to vanish lead to
$W' = E'-H_s(\alpha') = E'-H_s(\alpha^0) = W^0 = W$.] Thus (47) must be

$$\langle p\alpha'|H'\rangle = h^{3/2}\langle p\alpha'|p^0\alpha^0\rangle + \langle p\alpha'|V|k\rangle\langle k|H'\rangle\{1/(W'-W) - i\pi\,\delta(W'-W)\}, \qquad (48)$$

and from the general formula (37) the scattering coefficient will be

$$\frac{4\pi^2W^0W'P'}{hc^4P^0}\,|\langle p'\alpha'|V|k\rangle|^2\,|\langle k|H'\rangle|^2. \qquad (49)$$
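It remains for us to determine the value of $\langle k|H'\rangle$. We can do this
by substituting for $\langle p\alpha'|H'\rangle$ in the second of equations (46) its value
given by (48). This gives

$$\begin{aligned}
(E'-E_k)\langle k|H'\rangle &= h^{3/2}\langle k|V|p^0\alpha^0\rangle + \langle k|H'\rangle\sum_{\alpha'}\int |\langle k|V|p\alpha'\rangle|^2\{1/(W'-W) - i\pi\,\delta(W'-W)\}\,d^3p\\
&= h^{3/2}\langle k|V|p^0\alpha^0\rangle + \langle k|H'\rangle(a-ib),
\end{aligned}$$

where

$$a = \sum_{\alpha'}\int |\langle k|V|p\alpha'\rangle|^2\,d^3p/(W'-W) \qquad (50)$$

and

$$\begin{aligned}
b &= \pi\sum_{\alpha'}\int |\langle k|V|p\alpha'\rangle|^2\,\delta(W'-W)\,d^3p\\
&= \pi\sum_{\alpha'}\iiint |\langle k|V|P\omega\chi\alpha'\rangle|^2\,\delta(W'-W)\,P^2\,dP\,\sin\omega\,d\omega\,d\chi\\
&= \pi\sum_{\alpha'} P'W'c^{-2}\iint |\langle k|V|P'\omega\chi\alpha'\rangle|^2\,\sin\omega\,d\omega\,d\chi.
\end{aligned} \qquad (51)$$

Thus

$$\langle k|H'\rangle = h^{3/2}\langle k|V|p^0\alpha^0\rangle/(E'-E_k-a+ib). \qquad (52)$$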

Note  that  a and  b are  real  and  that  b is  positive. 

This value for $\langle k|H'\rangle$ substituted in (49) gives for the scattering
coefficient

$$\frac{4\pi^2h^2W^0W'P'}{c^4P^0}\,\frac{|\langle p'\alpha'|V|k\rangle|^2\,|\langle k|V|p^0\alpha^0\rangle|^2}{(E'-E_k-a)^2+b^2}. \qquad (53)$$

One  can  obtain  the  total  effective  area  that  the  incident  particle 
must  hit  in  order  to  be  scattered  anywhere  by  integrating  (53)  over 
all  directions  of  scattering,  i.e.  by  integrating  over  all  directions  of 
the  vector  p'  with  its  magnitude  kept  fixed  at  P',  and  then  summing 
over all $\alpha'$ that are to be taken into consideration, i.e. for which
$W' > mc^2$. This gives, with the help of (51), the result

$$\frac{4\pi h^2W^0}{c^2P^0}\,\frac{b\,|\langle k|V|p^0\alpha^0\rangle|^2}{(E'-E_k-a)^2+b^2}. \qquad (54)$$


If  we  suppose  E'  to  vary  continuously  through  the  value  Ek,  the 
main variation of (53) or (54) will be due to the small denominator
$(E'-E_k-a)^2+b^2$. If we neglect the dependence of the other factors
in (53) and (54) on $E'$, then the maximum scattering will occur when
$E'$ has the value $E_k+a$ and the scattering will be half its maximum
when $E'$ differs from this value by an amount b. The large amount of
scattering that occurs for values of the energy of the incident particle
that make $E'$ nearly equal to $E_k$ gives rise to the phenomenon of an
absorption line. The centre of the line is displaced by an amount
a from the resonance energy of the incident particle, i.e. the energy
which would make the total energy just $E_k$, while the quantity b is
what is sometimes called the half-width of the line.
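
The statements about the position and half-width of the line can be checked on the energy-dependent factor of (53) and (54) alone. The sketch below uses made-up values of $E_k$, a and b in arbitrary energy units, and the function name `line_shape` is hypothetical.

```python
# Energy-dependent factor 1 / ((E' - E_k - a)^2 + b^2) of formulas (53) and (54).
# The values of E_k, a and b are hypothetical, in arbitrary energy units.

def line_shape(E, E_k, a, b):
    return 1.0 / ((E - E_k - a) ** 2 + b ** 2)

E_k, a, b = 10.0, 0.3, 0.05

# Scan a grid of total energies E' around E_k and locate the maximum.
energies = [E_k + i * 0.001 for i in range(-1000, 1001)]
E_peak = max(energies, key=lambda E: line_shape(E, E_k, a, b))

peak = line_shape(E_k + a, E_k, a, b)
ratio = line_shape(E_k + a + b, E_k, a, b) / peak   # value a distance b from the centre

print(E_peak, ratio)
```

The maximum sits at $E_k + a$ rather than at $E_k$, and the factor has fallen to half its maximum at a distance b from the centre, in agreement with the text.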

53.  Emission  and  absorption 

For  studying  emission  and  absorption  we  must  consider  non- 
stationary states  of  the  system  and  must  use  the  perturbation  method 
of  § 44.  To  determine  the  coefficient  of  spontaneous  emission  we  must 
take  an  initial  state  for  which  the  particle  is  absorbed,  corresponding 
to a ket $|k\rangle$, and determine the probability that at some later time
the particle shall be on its way to infinity with a definite momentum.
The method of § 46 can now be applied. From the result (39) of that
section we see that the probability per unit time per unit range of $\omega$
and $\chi$, of the particle being emitted in any direction $\omega'$, $\chi'$ with the
scatterer being left in state $\alpha'$ is

$$2\pi\hbar^{-1}\,|\langle W'\omega'\chi'\alpha'|V|k\rangle|^2, \qquad (55)$$

provided, of course, that $\alpha'$ is such that the energy $W'$, given by (18),
of the particle is greater than $mc^2$. For values of $\alpha'$ that do not satisfy
this condition there is no emission possible. The matrix element
$\langle W'\omega'\chi'\alpha'|V|k\rangle$ here must refer to a representation in which W, $\omega$, $\chi$,
and $\alpha$ are diagonal with the weight function unity. The matrix
elements of V appearing in the three preceding sections refer to a
representation in which $p_x$, $p_y$, $p_z$ are diagonal with the weight function
unity, or P, $\omega$, $\chi$ are diagonal with the weight function $P^2\sin\omega$.
They would thus refer to a representation in which W, $\omega$, $\chi$ are
diagonal with the weight function $dP/dW\,.\,P^2\sin\omega = WP/c^2\,.\,\sin\omega$.
Thus the matrix element $\langle W'\omega'\chi'\alpha'|V|k\rangle$ in (55) is equal to
$(W'P'/c^2\,.\,\sin\omega')^{1/2}$ times our previous matrix element $\langle W'\omega'\chi'\alpha'|V|k\rangle$
or $\langle p'\alpha'|V|k\rangle$, so that (55) is equal to

$$\frac{2\pi}{\hbar}\,\frac{W'P'}{c^2}\,\sin\omega'\,|\langle p'\alpha'|V|k\rangle|^2.$$

The probability of emission per unit solid angle per unit time, with
the scatterer simultaneously dropping to state $\alpha'$, is thus

$$\frac{2\pi}{\hbar}\,\frac{W'P'}{c^2}\,|\langle p'\alpha'|V|k\rangle|^2. \qquad (56)$$
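
The change of weight function used in passing from (55) to (56) can be written out in one line. This is a sketch that assumes the usual relativistic relation between the energy W and the momentum P of the particle, with the rest energy included in W (as the condition $W' > mc^2$ suggests); the symbol $d\Omega$ for the element of solid angle is introduced here only for illustration.

```latex
W^2 = m^2c^4 + P^2c^2
\;\Longrightarrow\; W\,dW = c^2 P\,dP
\;\Longrightarrow\; \frac{dP}{dW} = \frac{W}{c^2P},
\qquad
\frac{dP}{dW}\,P^2\sin\omega = \frac{WP}{c^2}\,\sin\omega,
\qquad
d\Omega = \sin\omega\,d\omega\,d\chi .
```

Dividing the probability per unit range of $\omega$ and $\chi$ by $\sin\omega'$ then gives the probability per unit solid angle, which is how the factor $\sin\omega'$ disappears in (56).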

To obtain the total probability per unit time of the particle being
emitted in any direction, with any final state for the scatterer, we
must integrate (56) over all angles $\omega'$, $\chi'$ and sum over all states $\alpha'$
for which $W' > mc^2$. The result is just $2b/\hbar$, where b is defined by (51).
There is thus a simple relation between the total emission coefficient
and the half-width b of the absorption line.

Let us now consider absorption. This requires that we should take
an initial state for which the particle is certainly not absorbed but is
incident with a definite momentum. Thus the ket corresponding to
the initial state must be of the form (19). We must now determine
the probability of the particle being absorbed after time t. Since our
final state k is not one of a continuous range, we cannot use directly
the result (39) of § 46. If, however, we take

$$|0\rangle = |p^0\alpha^0\rangle \qquad (57)$$

as the ket corresponding to the initial state, the analysis of §§ 44 and 46
is still applicable as far as equation (36) and shows us that the
probability of the particle being absorbed into state k after time t is

$$2\,|\langle k|V|p^0\alpha^0\rangle|^2\,[1-\cos\{(E_k-E')t/\hbar\}]/(E_k-E')^2.$$

This corresponds to a distribution of incident particles of density
$h^{-3}$, owing to the omission of the factor $h^{3/2}$ from (57), as compared
with (19). The probability of there being an absorption after time
t when there is one incident particle crossing unit area per unit time
is therefore

$$\frac{2h^3W^0}{c^2P^0}\,|\langle k|V|p^0\alpha^0\rangle|^2\,[1-\cos\{(E_k-E')t/\hbar\}]/(E_k-E')^2. \qquad (58)$$

To obtain the absorption coefficient we must consider the incident
particles not all to have exactly the same energy $W^0 = E'-H_s(\alpha^0)$,
but to have a distribution of energy values about the correct value
$E_k-H_s(\alpha^0)$ required for absorption. If we take a beam of incident
particles consisting of one crossing unit area per unit time per unit
energy range, the probability of there being an absorption after time
t will be given by the integral of (58) with respect to $E'$. This integral
may be evaluated in the same way as (37) of § 46 and is equal to

$$\frac{4\pi^2h^2W^0t}{c^2P^0}\,|\langle k|V|p^0\alpha^0\rangle|^2.$$

The probability per unit time of an absorption taking place with an
incident beam of one particle per unit area per unit time per unit
energy range is therefore

$$\frac{4\pi^2h^2W^0}{c^2P^0}\,|\langle k|V|p^0\alpha^0\rangle|^2, \qquad (59)$$

which  is  the  absorption  coefficient. 
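
The step from (58) to (59) rests on one standard integral. The following worked lines are a sketch that assumes $h = 2\pi\hbar$, which is the relation the formulas of this section need in order to agree with one another; the substitution variable u is introduced here only for the calculation.

```latex
u = (E'-E_k)\,t/\hbar:\qquad
\int_{-\infty}^{\infty}\frac{1-\cos\{(E_k-E')t/\hbar\}}{(E_k-E')^2}\,dE'
  = \frac{t}{\hbar}\int_{-\infty}^{\infty}\frac{1-\cos u}{u^2}\,du
  = \frac{\pi t}{\hbar},
\qquad
\frac{2h^3W^0}{c^2P^0}\,|\langle k|V|p^0\alpha^0\rangle|^2\,\frac{\pi t}{\hbar}
  = \frac{4\pi^2h^2W^0\,t}{c^2P^0}\,|\langle k|V|p^0\alpha^0\rangle|^2 .
```

The result grows in proportion to t, so dividing by t gives the constant probability per unit time (59).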
The connexion between the absorption and emission coefficients
(59) and (56) and the resonance scattering coefficients calculated in
the preceding section should be noted. When the incident beam does
not consist of particles all with the same energy, but consists of a unit
distribution of particles per unit energy range crossing unit area per
unit time, the total number of incident particles with energies near
an absorption line that get scattered will be given by the integral
of (54) with respect to $E'$. If one neglects the dependence of the
numerator of (54) on $E'$, this integral will, since

$$\int_{-\infty}^{\infty}\frac{b}{(E'-E_k-a)^2+b^2}\,dE' = \pi,$$

have just the value (59). Thus the total number of scattered particles
in the neighbourhood of an absorption line is equal to the total number
absorbed. We can therefore regard all these scattered particles as
absorbed particles that are subsequently re-emitted in a different
direction. Further, the number of particles in the neighbourhood of
the absorption line that get scattered per unit solid angle about a
given direction specified by $p'$ and then belong to scatterers in state
$\alpha'$ will be given by the integral with respect to $E'$ of (53), which
integral has in the same way the value

$$\frac{4\pi^2h^2W^0W'P'}{c^4P^0}\,\frac{\pi}{b}\,|\langle p'\alpha'|V|k\rangle|^2\,|\langle k|V|p^0\alpha^0\rangle|^2.$$

This is just equal to the absorption coefficient (59) multiplied by the
emission coefficient (56) divided by $2b/\hbar$, the total emission coefficient.
This  is  in  agreement  with  the  point  of  view  of  regarding  the  resonance 
scattered  particles  as  those  that  are  absorbed  and  then  re-emitted, 
with  the  absorption  and  emission  processes  governed  independently 
each  by  its  own  probability  law,  since  this  point  of  view  would 
make  the  fraction  of  the  total  number  of  absorbed  particles  that  are 
re-emitted in a unit solid angle about a given direction just the
emission  coefficient  for  this  direction  divided  by  the  total  emission 
coefficient. 
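
The integral that links (54) to (59) can be confirmed numerically. The sketch below is an illustration only: the half-width b, the cutoff and the step count are arbitrary choices, the variable x stands for $E'-E_k-a$, and the helper names are hypothetical.

```python
import math

# Numerical check that the integral of b / (x^2 + b^2) over all x equals pi,
# where x stands for E' - E_k - a. The value of b is hypothetical.

def lorentzian(x, b):
    return b / (x * x + b * b)

def midpoint_integral(f, lo, hi, steps):
    """Plain midpoint rule on [lo, hi] with `steps` equal sub-intervals."""
    width = (hi - lo) / steps
    return width * sum(f(lo + (i + 0.5) * width) for i in range(steps))

b = 0.05
cutoff = 500.0
steps = 200_000        # step 0.005, ten times finer than b

approx = midpoint_integral(lambda x: lorentzian(x, b), -cutoff, cutoff, steps)

# Part of the integral lying outside [-cutoff, cutoff], from the arctangent antiderivative.
tail = 2 * (math.pi / 2 - math.atan(cutoff / b))

print(approx, approx + tail, math.pi)
```

The value does not depend on b, which is why integrating (54) over the line removes the factor b from the numerator and leaves exactly (59).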


IX 

SYSTEMS  CONTAINING  SEVERAL  SIMILAR  PARTICLES 

54.  Symmetrical  and  antisymmetrical  states 

If  a system  in  atomic  physics  contains  a number  of  particles  of  the 
same  kind,  e.g.  a number  of  electrons,  the  particles  are  absolutely 
indistinguishable  one  from  another.  No  observable  change  is  made 
when  two  of  them  are  interchanged.  This  circumstance  gives  rise  to 
some  curious  phenomena  in  quantum  mechanics  having  no  analogue 
in  the  classical  theory,  which  arise  from  the  fact  that  in  quantum 
mechanics  a transition  may  occur  resulting  in  merely  the  interchange 
of two similar particles, which transition then could not be detected
by  any  observational  means.  A satisfactory  theory  ought,  of  course, 
to  count  two  observationally  indistinguishable  states  as  the  same 
state  and  to  deny  that  any  transition  does  occur  when  two  similar 
particles  exchange  places.  We  shall  find  that  it  is  possible  to  reformu- 
late the  theory  so  that  this  is  so. 

Suppose  we  have  a system  containing  n similar  particles.  We  may 
take as our dynamical variables a set of variables $\xi_1$ describing the
first particle, the corresponding set $\xi_2$ describing the second particle,
and so on up to the set $\xi_n$ describing the nth particle. We shall then
have the $\xi_r$'s commuting with the $\xi_s$'s for $r \neq s$. (We may require
certain extra variables, describing what the system consists of in
addition to the n similar particles, but it is not necessary to mention
these explicitly in the present chapter.) The Hamiltonian describing
the motion of the system will now be expressible as a function of the
$\xi_1, \xi_2, \ldots, \xi_n$. The fact that the particles are similar requires that the
Hamiltonian shall be a symmetrical function of the $\xi_1, \xi_2, \ldots, \xi_n$, i.e. it
shall remain unchanged when the sets of variables $\xi_r$ are interchanged
or permuted in any way. This condition must hold, no matter what
perturbations are applied to the system. In fact, any quantity of
physical significance must be a symmetrical function of the $\xi$'s.

Let $|a_1\rangle$, $|b_1\rangle$,... be kets for the first particle considered as a
dynamical system by itself. There will be corresponding kets $|a_2\rangle$, $|b_2\rangle$,... for
the second particle by itself, and so on. We can get a ket for the
assembly by taking the product of kets for each particle by itself,
for example

$$|a_1\rangle|b_2\rangle|c_3\rangle\ldots|g_n\rangle = |a_1b_2c_3\ldots g_n\rangle \qquad (1)$$

say, according to the notation of (65) of § 20. The ket (1) corresponds
to  a special  kind  of  state  for  the  assembly,  which  may  be  described 
by  saying  that  each  particle  is  in  its  own  state,  corresponding  to  its 
own  factor  on  the  left-hand  side  of  (1).  The  general  ket  for  the 
assembly  is  of  the  form  of  a sum  or  integral  of  kets  like  (1),  and 
corresponds  to  a state  for  the  assembly  for  which  one  cannot  say  that 
each  particle  is  in  its  own  state,  but  only  that  each  particle  is  partly 
in  several  states,  in  a way  which  is  correlated  with  the  other  particles 
being partly in several states. If the kets $|a_1\rangle$, $|b_1\rangle$,... are a set of
basic kets for the first particle by itself, the kets $|a_2\rangle$, $|b_2\rangle$,... will be
a set of basic kets for the second particle by itself, and so on, and the
kets (1) will be a set of basic kets for the assembly. We call the
representation provided by such basic kets for the assembly a symmetrical
representation, as it treats all the particles on the same footing.

In (1) we may interchange the kets for the first two particles and
get another ket for the assembly, namely

$$|b_1\rangle|a_2\rangle|c_3\rangle\ldots|g_n\rangle = |b_1a_2c_3\ldots g_n\rangle.$$

More generally, we may interchange the role of the first two particles
in  any  ket  for  the  assembly  and  get  another  ket  for  the  assembly. 
The  process  of  interchanging  the  first  two  particles  is  an  operator 
which  can  be  applied  to  kets  for  the  assembly,  and  is  evidently  a 
linear  operator,  of  the  type  dealt  with  in  § 7.  Similarly,  the  process 
of  interchanging  any  pair  of  particles  is  a linear  operator,  and  by 
repeated  applications  of  such  interchanges  we  get  any  permutation 
of  the  particles  appearing  as  a linear  operator  which  can  be  applied 
to  kets  for  the  assembly.  A permutation  is  called  an  even  permutation 
or  an  odd  permutation  according  to  whether  it  can  be  built  up  from 
an  even  or  an  odd  number  of  interchanges. 
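
The definition of even and odd permutations lends itself to a direct count of interchanges. The sketch below is one illustrative way of doing this; the function name `parity` and the 0-based tuple encoding of a permutation are assumptions made for the example.

```python
def parity(perm):
    """Return +1 for an even permutation and -1 for an odd one.

    `perm` is a tuple in which perm[i] is the image of i (0-based).
    The permutation is undone by successive interchanges of two entries,
    and the number of interchanges needed is counted.
    """
    work = list(perm)
    interchanges = 0
    for i in range(len(work)):
        while work[i] != i:
            j = work[i]
            work[i], work[j] = work[j], work[i]   # one interchange
            interchanges += 1
    return 1 if interchanges % 2 == 0 else -1

print(parity((0, 1, 2)))   # identity
print(parity((1, 0, 2)))   # a single interchange of the first two
print(parity((1, 2, 0)))   # a cyclic permutation of three objects
```

A single interchange is odd, while a cyclic permutation of three objects can be built from two interchanges and is therefore even.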

A ket for the assembly $|X\rangle$ is called symmetrical if it is unchanged
by any permutation, i.e. if

$$P|X\rangle = |X\rangle \qquad (2)$$

for any permutation P. It is called antisymmetrical if it is unchanged
by any even permutation and has its sign changed by any odd
permutation, i.e. if

$$P|X\rangle = \pm|X\rangle, \qquad (3)$$

the + or - sign being taken according to whether P is even or odd.
The  state  corresponding  to  a symmetrical  ket  is  called  a symmetrical 
state,  and  the  state  corresponding  to  an  antisymmetrical  ket  is  called 
an antisymmetrical state. In a symmetrical representation, the
representative of a symmetrical ket is a symmetrical function of the
variables  referring  to  the  various  particles  and  the  representative  of 
an  antisymmetrical  ket  is  an  antisymmetrical  function. 

In the Schrödinger picture, the ket corresponding to a state of the
assembly will vary with time according to Schrödinger's equation of
motion. If it is initially symmetrical it must always remain symmetrical,
since, owing to the Hamiltonian being symmetrical, there
is nothing to disturb the symmetry. Similarly if the ket is initially
antisymmetrical it must always remain antisymmetrical. Thus a
state which is initially symmetrical always remains symmetrical and
a state which is initially antisymmetrical always remains antisymmetrical.
In consequence, it may be that for a particular kind of
particle only symmetrical states occur in nature, or only antisymmetrical
states occur in nature. If either of these possibilities
held,  it  would  lead  to  certain  special  phenomena  for  the  particles  in 
question. 

Let  us  suppose  first  that  only  antisymmetrical  states  occur  in 
nature.  The  ket  (1)  is  not  antisymmetrical  and  so  does  not  corre- 
spond to  a state  occurring  in  nature.  From  (1)  we  can  in  general  form 
an  antisymmetrical  ket  by  applying  all  possible  permutations  to  it 
and adding the results, with the coefficient $-1$ inserted before those
terms arising from an odd permutation, so as to get

$$\sum_P \pm P|a_1b_2c_3\ldots g_n\rangle, \qquad (4)$$

the + or - sign being taken according to whether P is even or odd.
The ket (4) may be written as a determinant

$$\begin{vmatrix}
|a_1\rangle & |a_2\rangle & |a_3\rangle & \cdots & |a_n\rangle \\
|b_1\rangle & |b_2\rangle & |b_3\rangle & \cdots & |b_n\rangle \\
|c_1\rangle & |c_2\rangle & |c_3\rangle & \cdots & |c_n\rangle \\
\vdots & \vdots & \vdots & & \vdots \\
|g_1\rangle & |g_2\rangle & |g_3\rangle & \cdots & |g_n\rangle
\end{vmatrix} \qquad (5)$$

and its representative in a symmetrical representation is a determinant.
The ket (4) or (5) is not the general antisymmetrical ket, but
is a specially simple one. It corresponds to a state for the assembly
for which one can say that certain particle-states, namely the states
a, b, c,..., g, are occupied, but one cannot say which particle is in
which state, each particle being equally likely to be in any state. If
two of the particle-states a, b, c,..., g are the same, the ket (4) or (5)
vanishes and does not correspond to any state for the assembly.
Thus two particles cannot occupy the same state. More generally, the
occupied  states  must  be  all  independent,  otherwise  (4)  or  (5)  vanishes. 
This  is  an  important  characteristic  of  particles  for  which  only  anti- 
symmetrical  states  occur  in  nature.  It  leads  to  a special  statistics, 
which  was  first  studied  by  Fermi,  so  we  shall  call  particles  for  which 
only  antisymmetrical  states  occur  in  nature  fermions. 
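
The construction of the antisymmetrical ket (4), and its vanishing when two particle-states coincide, can be traced in a few lines. The sketch below is only an illustration: a product ket is represented by a tuple of state labels (position r holds the state of particle r+1), a general ket by a dictionary from such tuples to coefficients, and the names `sign` and `antisymmetrize` are hypothetical.

```python
from itertools import permutations

def sign(perm):
    """+1 for an even permutation of 0..n-1, -1 for an odd one (by counting inversions)."""
    inversions = sum(1 for i in range(len(perm))
                       for j in range(i + 1, len(perm))
                       if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def antisymmetrize(states):
    """Apply every permutation to a product ket and add the results with signs, as in (4).

    Returns a dict mapping each product ket (a tuple of labels) to its coefficient;
    terms whose coefficients cancel are dropped.
    """
    ket = {}
    for perm in permutations(range(len(states))):
        labels = tuple(states[i] for i in perm)
        ket[labels] = ket.get(labels, 0) + sign(perm)
    return {labels: c for labels, c in ket.items() if c != 0}

print(antisymmetrize(("a", "b")))        # |a1 b2> - |b1 a2>
print(antisymmetrize(("a", "a", "c")))   # two particle-states equal: nothing survives
```

With two equal labels every term is cancelled by the term obtained from it by interchanging the two equal states, so the result is the empty dictionary, which is the exclusion property stated in the text.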

Let  us  suppose  now  that  only  symmetrical  states  occur  in  nature. 
The  ket  (1)  is  not  symmetrical,  except  in  the  special  case  when  all  the 
particle-states  a,b,c,...,g  are  the  same,  but  we  can  always  obtain  a 
symmetrical  ket  from  it  by  applying  all  possible  permutations  to  it 
and  adding  the  results,  so  as  to  get 

$$\sum_P P|a_1b_2c_3\ldots g_n\rangle. \qquad (6)$$

The  ket  (6)  is  not  the  general  symmetrical  ket,  but  is  a specially 
simple  one.  It  corresponds  to  a state  for  the  assembly  for  which  one 
can  say  that  certain  particle-states  are  occupied,  namely  the  states 
a,  b,  c,...,g,  without  being  able  to  say  which  particle  is  in  which  state. 
It  is  now  possible  for  two  or  more  of  the  states  a,b,c,...,g  to  be  the 
same,  so  that  two  or  more  particles  can  be  in  the  same  state.  In  spite 
of  this,  the  statistics  of  the  particles  is  not  the  same  as  the  usual 
statistics  of  the  classical  theory.  The  new  statistics  was  first  studied 
by  Bose,  so  we  shall  call  particles  for  which  only  symmetrical  states 
occur  in  nature  bosons. 

We  can  see  the  difference  of  Bose  statistics  from  the  usual  statistics 
by  considering  a special  case— that  of  only  two  particles  and  only  two 
independent  states  a and  b for  a particle.  According  to  classical 
mechanics,  if  the  assembly  of  two  particles  is  in  thermodynamic 
equilibrium  at  a high  temperature,  each  particle  will  be  equally  likely 
to be in either state. There is thus a probability $\tfrac14$ of both particles
being in state a, a probability $\tfrac14$ of both particles being in state b,
and a probability $\tfrac12$ of one particle being in each state. In the quantum
theory there are three independent symmetrical states for the
pair of particles, corresponding to the symmetrical kets $|a_1\rangle|a_2\rangle$,
$|b_1\rangle|b_2\rangle$, and $|a_1\rangle|b_2\rangle + |a_2\rangle|b_1\rangle$, and describable as both particles in
state a, both particles in state b, and one particle in each state
respectively. For thermodynamic equilibrium at a high temperature
these three states are equally probable, as was shown in § 33, so that
there is a probability $\tfrac13$ of both particles being in state a, a probability
$\tfrac13$ of both particles being in state b, and a probability $\tfrac13$ of one particle
being in each state. Thus with Bose statistics the probability of two
particles being in the same state is greater than with classical statistics.
Bose statistics differ from classical statistics in the opposite direction
to Fermi statistics, for which the probability of two particles being
in the same state is zero.
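
The counting in the two-particle, two-state example can be reproduced by enumeration. The sketch below assumes, as the text does for high temperature, that every allowed configuration is equally probable; the helper name `same_state_probability` is hypothetical.

```python
from fractions import Fraction
from itertools import product, combinations, combinations_with_replacement

states = ("a", "b")

def same_state_probability(configurations):
    """Fraction of equally probable configurations with both particles in one state."""
    same = sum(1 for first, second in configurations if first == second)
    return Fraction(same, len(configurations))

# Classical: the particles are labelled, so (a, b) and (b, a) count separately.
classical = list(product(states, repeat=2))
# Bose: one symmetrical state for each unordered choice, repeats allowed.
bose = list(combinations_with_replacement(states, 2))
# Fermi: one antisymmetrical state for each unordered choice of distinct states.
fermi = list(combinations(states, 2))

print(same_state_probability(classical))   # 1/2
print(same_state_probability(bose))        # 2/3
print(same_state_probability(fermi))       # 0
```

The three values are ordered as the text describes: Bose statistics raises the probability of finding both particles in the same state above the classical value, and Fermi statistics reduces it to zero.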

In  building  up  a theory  of  atoms  on  the  lines  mentioned  at  the 
beginning of § 38, to get agreement with experiment one must assume
that two electrons are never in the same state. This rule is known as
Pauli's exclusion principle. It shows us that electrons are fermions.
Planck's law of radiation shows us that photons are bosons, as only the
Bose statistics for photons will lead to Planck's law. Similarly, for
each of the other kinds of particle known in physics, there is experimental
evidence to show either that they are fermions, or that they
are bosons. Protons, neutrons, positrons are fermions, $\alpha$-particles are
bosons. It appears that all particles occurring in nature are either
fermions or bosons, and thus only antisymmetrical or symmetrical
states for an assembly of similar particles are met with in practice.
Other more complicated kinds of symmetry are possible mathematically,
but do not apply to any known particles. With a theory which
allows only antisymmetrical or only symmetrical states for a particular
kind of particle, one cannot make a distinction between two states
which  differ  only  through  a permutation  of  the  particles,  so  that  the 
transitions  mentioned  at  the  beginning  of  this  section  disappear. 

55.  Permutations  as  dynamical  variables 

We  shall  now  build  up  a general  theory  for  a system  containing  n 
similar  particles  when  states  with  any  kind  of  symmetry  properties 
are  allowed,  i.e.  when  there  is  no  restriction  to  only  symmetrical  or 
only antisymmetrical states. The general state now will not be symmetrical
or antisymmetrical, nor will it be expressible linearly in
terms of symmetrical and antisymmetrical states when n > 2. This
theory will not apply directly to any particles occurring in nature,
but  all  the  same  it  is  useful  for  setting  up  an  approximate  treatment 
for  an  assembly  of  electrons,  as  will  be  shown  in  § 58. 

We  have  seen  that  each  permutation  P of  the  n particles  is  a linear 
operator  which  can  be  applied  to  any  ket  for  the  assembly.  Hence 
we  can  regard  P as  a dynamical  variable  in  our  system  of  n particles. 
There are n! permutations, each of which can be regarded as a
dynamical variable. One of them, $P_1$ say, is the identical permutation,
which is equal to unity. The product of any two permutations is a
third permutation and hence any function of the permutations is
reducible to a linear function of them. Any permutation P has a
reciprocal $P^{-1}$ satisfying

$$PP^{-1} = P^{-1}P = P_1 = 1.$$

A permutation P can be applied to a bra $\langle X|$ for the assembly,
to give another bra, which we shall denote for the present by $P\langle X|$.
If P is applied to both factors of the product $\langle X|Y\rangle$, the product
must be unchanged, since it is just a number, independent of any
order of the particles. Thus

$$(P\langle X|)\,P|Y\rangle = \langle X|Y\rangle,$$

showing that

$$P\langle X| = \langle X|P^{-1}. \qquad (7)$$

Now $P\langle X|$ is the conjugate imaginary of $P|X\rangle$ and is thus equal to
$\langle X|\bar{P}$, and hence from (7)

$$\bar{P} = P^{-1}. \qquad (8)$$

Thus a permutation is not in general a real dynamical variable, its
conjugate complex being equal to its reciprocal.
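
Relation (8) can be seen concretely on a small matrix model. The sketch below is an illustration under stated assumptions: a permutation is represented by its matrix acting on three labelled basis vectors rather than on kets for an assembly, and the conjugate complex of the operator is taken to be the conjugate transpose of that matrix. The helper names are hypothetical.

```python
def permutation_matrix(images):
    """Matrix of the permutation that sends basis vector j to basis vector images[j]."""
    n = len(images)
    return [[1 if images[j] == i else 0 for j in range(n)] for i in range(n)]

def transpose(m):
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

cycle = (1, 2, 0)                 # 0 -> 1 -> 2 -> 0, a cyclic permutation
P = permutation_matrix(cycle)
P_bar = transpose(P)              # entries are real, so the conjugate transpose is the transpose
identity = [[1 if i == j else 0 for j in range(3)] for i in range(3)]

print(matmul(P_bar, P) == identity)   # P_bar acts as the reciprocal of P
print(P == P_bar)                     # P is not equal to its own conjugate
```

The matrix of the cyclic permutation differs from its transpose, so in this model it is not a real dynamical variable, while the transpose multiplied by the matrix gives unity, as (8) requires.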

Any permutation of the numbers 1, 2, 3,..., $n$ may be expressed in
the cyclic notation, e.g. with $n = 8$

$$P_a = (143)(27)(58)(6), \tag{9}$$

in which each number is to be replaced by the succeeding number in
a bracket, unless it is the last in a bracket, when it is to be replaced
by the first in that bracket. Thus $P_a$ changes the numbers 12345678
into 47138625. The type of any permutation is specified by the
partition of the number $n$ which is provided by the number of
numbers in each of the brackets. Thus the type of $P_a$ is specified by the
partition 8 = 3+2+2+1. Permutations of the same type, i.e.
corresponding to the same partition, we shall call similar. Thus, for
example, $P_a$ in (9) is similar to

$$P_b = (871)(35)(46)(2). \tag{10}$$

The whole of the $n!$ possible permutations may be divided into sets
of similar permutations, each such set being called a class. The
permutation $P_1 = 1$ forms a class by itself. Any permutation is similar
to its reciprocal.

When two permutations $P_a$ and $P_b$ are similar, either of them $P_b$
may be obtained by making a certain permutation $P_x$ in the other
$P_a$. Thus, in our example (9), (10) we can take $P_x$ to be the
permutation that changes 14327586 into 87135462, i.e. the permutation

$$P_x = (18623)(475).$$

Different ways of writing $P_a$ and $P_b$ in the cyclic notation would lead
to different $P_x$'s. Any of these $P_x$'s applied to the product $P_a|X\rangle$
would change it into $P_b \cdot P_x|X\rangle$, i.e.

$$P_x P_a|X\rangle = P_b P_x|X\rangle.$$

Hence
$$P_b = P_x P_a P_x^{-1}, \tag{11}$$

which expresses the condition for $P_a$ and $P_b$ to be similar as an
algebraic equation. The existence of any $P_x$ satisfying (11) is
sufficient to show that $P_a$ and $P_b$ are similar.
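The worked example (9), (10), (11) can be checked mechanically. The following is a minimal sketch, not part of the original text: the helper names (`from_cycles`, `compose`, `inverse`, `cycle_type`) are illustrative, and a permutation is stored as a dictionary sending each number to the number that replaces it.

```python
def from_cycles(cycles, n):
    """Build a permutation of 1..n (as a dict) from its cyclic notation."""
    perm = {i: i for i in range(1, n + 1)}
    for cycle in cycles:
        for pos, item in enumerate(cycle):
            # each number is replaced by the succeeding number in its bracket
            perm[item] = cycle[(pos + 1) % len(cycle)]
    return perm


def compose(p, q):
    """Return the permutation 'apply q first, then p'."""
    return {i: p[q[i]] for i in q}


def inverse(p):
    """Return the reciprocal permutation."""
    return {image: i for i, image in p.items()}


def cycle_type(p):
    """Partition of n given by the bracket lengths, largest first."""
    seen, lengths = set(), []
    for start in p:
        length, i = 0, start
        while i not in seen:
            seen.add(i)
            i = p[i]
            length += 1
        if length:
            lengths.append(length)
    return sorted(lengths, reverse=True)


P_a = from_cycles([(1, 4, 3), (2, 7), (5, 8), (6,)], 8)      # equation (9)
P_b = from_cycles([(8, 7, 1), (3, 5), (4, 6), (2,)], 8)      # equation (10)
P_x = from_cycles([(1, 8, 6, 2, 3), (4, 7, 5)], 8)

image_of_12345678 = [P_a[i] for i in range(1, 9)]
conjugate = compose(compose(P_x, P_a), inverse(P_x))         # right side of (11)
```

Here `image_of_12345678` reproduces the string 47138625 quoted in the text, both `P_a` and `P_b` have cycle type 3+2+2+1, and `conjugate` coincides with `P_b`.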


56.  Permutations  as  constants  of  the  motion 

Any symmetrical function $V$ of the dynamical variables of all the
particles is unchanged by the application of any permutation $P$, so
$P$ applied to the product $V|X\rangle$ affects only the factor $|X\rangle$, thus

$$P V|X\rangle = V P|X\rangle.$$

Hence
$$P V = V P, \tag{12}$$

showing that a symmetrical function of the dynamical variables
commutes with every permutation. The Hamiltonian is a symmetrical
function of the dynamical variables and thus commutes with every
permutation. It follows that each permutation is a constant of the
motion. This holds even if the Hamiltonian is not constant. If $|Xt\rangle$
is any solution of Schrödinger's equation of motion, $P|Xt\rangle$ is another.

In dealing with any system in quantum mechanics, when we have
found a constant of the motion $\alpha$, we know that if for any state of
motion $\alpha$ initially has the numerical value $\alpha'$, then it always has this
value, so that we can assign different numbers $\alpha'$ to the different
states and so obtain a classification of the states. The procedure is
not so straightforward, however, when we have several constants of
the motion $\alpha$ which do not commute (as is the case with our
permutations $P$), since we cannot in general assign numerical values for all
the $\alpha$'s simultaneously to any state. Let us first take the case of a
system whose Hamiltonian does not involve the time explicitly. The
existence of constants of the motion $\alpha$ which do not commute is
then a sign that the system is degenerate. This is because, for a
non-degenerate system, the Hamiltonian $H$ by itself forms a complete
set of commuting observables and hence, from Theorem 2 of § 19, each
of the $\alpha$'s is a function of $H$ and therefore commutes with any other $\alpha$.

We must now look for a function $\beta$ of the $\alpha$'s which has one and
the same numerical value $\beta'$ for all those states belonging to one
energy-level $H'$, so that we can use $\beta$ for classifying the energy-levels
of the system. We can express the condition for $\beta$ by saying that it
must be a function of $H$ and must therefore commute with every
dynamical variable that commutes with $H$, i.e. with every constant
of the motion. If the $\alpha$'s are the only constants of the motion, or if
they are a set that commute with all other independent constants of
the motion, our problem reduces to finding a function $\beta$ of the $\alpha$'s
which commutes with all the $\alpha$'s. We can then assign a numerical
value $\beta'$ for $\beta$ to each energy-level of the system. If we can find
several such functions $\beta$, they must all commute with each other, so
that we can give them all numerical values simultaneously. We
obtain thus a classification of the energy-levels. When the Hamiltonian
involves the time explicitly one cannot talk about energy-levels, but
the $\beta$'s will still give a useful classification of the states.

We follow this method in dealing with our permutations $P$. We
must find a function $\chi$ of the $P$'s such that $P\chi P^{-1} = \chi$ for every $P$.
It is evident that a possible $\chi$ is $\sum P_c$, the sum of all the permutations
in a certain class $c$, i.e. the sum of a set of similar permutations, since
$\sum P P_c P^{-1}$ must consist of the same permutations summed in a
different order. There will be one such $\chi$ for each class. Further, there can
be no other independent $\chi$, since an arbitrary function of the $P$'s can
be expressed as a linear function of them with numerical coefficients,
and it will not then commute with every $P$ unless the coefficients of
similar $P$'s are always the same. We thus obtain all the $\chi$'s that can
be used for classifying the states. It is convenient to define each $\chi$ as
an average instead of a sum, thus

$$\chi_c = n_c^{-1} \sum P_c,$$

where $n_c$ is the number of $P$'s in the class $c$. An alternative expression
for $\chi_c$ is
$$\chi_c = n!^{-1} \sum_P P P_c P^{-1}, \tag{13}$$

the sum being extended over all the $n!$ permutations $P$, it being easy
to verify that this sum contains each member of the class $c$ the same
number of times. For each permutation $P$ there is one $\chi$, $\chi(P)$ say,
equal to the average of all permutations similar to $P$. One of the
$\chi$'s is $\chi(P_1) = 1$.
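The two properties claimed for the class averages, that each commutes with every permutation and that the expression (13) agrees with the direct average over the class, can be verified by brute force for a small $n$. The sketch below is illustrative only: it takes $n = 4$, represents a linear function of the permutations as a dictionary from permutation to coefficient, and uses helper names that are not from the text.

```python
from fractions import Fraction
from itertools import permutations

n = 4
group = list(permutations(range(n)))      # P as a tuple: i -> P[i]


def compose(p, q):
    """Apply q first, then p."""
    return tuple(p[q[i]] for i in range(n))


def inverse(p):
    inv = [0] * n
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def cycle_type(p):
    """Partition of n that specifies the type (class) of p."""
    seen, lengths = set(), []
    for start in range(n):
        length, i = 0, start
        while i not in seen:
            seen.add(i)
            i = p[i]
            length += 1
        if length:
            lengths.append(length)
    return tuple(sorted(lengths, reverse=True))


# Divide the n! permutations into classes of similar permutations.
classes = {}
for p in group:
    classes.setdefault(cycle_type(p), []).append(p)

# chi_c = (1/n_c) * sum of the members of class c.
chi = {t: {p: Fraction(1, len(members)) for p in members}
       for t, members in classes.items()}


def conjugate_element(x, p):
    """Return P x P^{-1} for a linear function x of the permutations."""
    out = {}
    for q, coeff in x.items():
        key = compose(compose(p, q), inverse(p))
        out[key] = out.get(key, 0) + coeff
    return out


def average_over_group(pc):
    """Right-hand side of (13): (1/n!) * sum over P of P P_c P^{-1}."""
    out = {}
    for p in group:
        key = compose(compose(p, pc), inverse(p))
        out[key] = out.get(key, 0) + Fraction(1, len(group))
    return out


commutes_with_all = all(conjugate_element(x, p) == x
                        for x in chi.values() for p in group)
formula_13_agrees = all(average_over_group(members[0]) == chi[t]
                        for t, members in classes.items())
```

For $n = 4$ there are five classes, one for each partition of 4, and both checks succeed.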

The constants of the motion $\chi_1, \chi_2, ..., \chi_m$ obtained in this way will
each have a definite numerical value for every stationary state of the
system, in the case when the Hamiltonian does not involve the time
explicitly, and also in the general case can be used for classifying
the states, there being one set of states for every permissible set of
numerical values $\chi'_1, \chi'_2, ..., \chi'_m$ for the $\chi$'s. Since the $\chi$'s are always
constants of the motion, these sets of states will be exclusive, i.e.
transitions will never take place from a state in one set to a state in
another.

The permissible sets of values $\chi'$ that one can give to the $\chi$'s are
limited by the fact that there exist algebraic relations between the
$\chi$'s. The product of any two $\chi$'s, $\chi_p \chi_q$, is of course expressible as
a linear function of the $P$'s, and since it commutes with every $P$ it
must be expressible as a linear function of the $\chi$'s, thus

$$\chi_p \chi_q = a_1 \chi_1 + a_2 \chi_2 + ... + a_m \chi_m, \tag{14}$$

where the $a$'s are numbers. Any numerical values $\chi'$ that one gives
to the $\chi$'s must be eigenvalues of the $\chi$'s and must satisfy these same
algebraic equations. For every solution $\chi'$ of these equations there
is one exclusive set of states. One solution is evidently $\chi'_p = 1$ for
every $\chi_p$, giving the set of symmetrical states. A second obvious
solution, giving the set of antisymmetrical states, is $\chi'_p = \pm 1$, the
$+$ or $-$ sign being taken according to whether the permutations in
the class $p$ are even or odd. The other solutions may be worked out
in any special case by ordinary algebraic methods, as the coefficients
$a$ in (14) may be obtained directly by a consideration of the types
of permutation to which the $\chi$'s concerned refer. Any solution is,
apart from a certain factor, what is called in group theory a character
of the group of permutations. The $\chi$'s are all real dynamical variables,
since each $P$ and its conjugate complex $P^{-1}$ are similar and will occur
added together in the definition of any $\chi$, so that the $\chi'$'s must be all
real numbers.
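As an illustration of how the coefficients $a$ in (14) and the permissible values $\chi'$ can be worked out in a special case, the following sketch treats $n = 3$ with exact fractions. It is an assumption-laden example rather than the author's procedure: the function names are invented, and the three trial solutions are simply tested against every relation of the form (14).

```python
from fractions import Fraction
from itertools import permutations

n = 3
group = list(permutations(range(n)))


def compose(p, q):
    """Apply q first, then p."""
    return tuple(p[q[i]] for i in range(n))


def cycle_type(p):
    seen, lengths = set(), []
    for start in range(n):
        length, i = 0, start
        while i not in seen:
            seen.add(i)
            i = p[i]
            length += 1
        if length:
            lengths.append(length)
    return tuple(sorted(lengths, reverse=True))


classes = {}
for p in group:
    classes.setdefault(cycle_type(p), []).append(p)
types = sorted(classes)     # identity, interchanges, cyclic permutations of three


def chi_product(tp, tq):
    """Coefficients a_k in chi_p chi_q = sum_k a_k chi_k, as in (14)."""
    weight = Fraction(1, len(classes[tp]) * len(classes[tq]))
    product = {}
    for p in classes[tp]:
        for q in classes[tq]:
            r = compose(p, q)
            product[r] = product.get(r, 0) + weight
    # All members of one class carry the same coefficient, so reading off
    # one member and multiplying by the class size gives a_k.
    return {tk: product.get(classes[tk][0], 0) * len(classes[tk])
            for tk in types}


def satisfies_14(values):
    """True if the trial numbers chi' obey every relation (14)."""
    return all(
        values[tp] * values[tq]
        == sum(a * values[tk] for tk, a in chi_product(tp, tq).items())
        for tp in types for tq in types
    )


symmetrical = {(1, 1, 1): Fraction(1), (2, 1): Fraction(1), (3,): Fraction(1)}
antisymmetrical = {(1, 1, 1): Fraction(1), (2, 1): Fraction(-1), (3,): Fraction(1)}
third_solution = {(1, 1, 1): Fraction(1), (2, 1): Fraction(0), (3,): Fraction(-1, 2)}
```

The symmetrical and antisymmetrical sets are the two obvious solutions named in the text; the third set, with value 0 for the interchanges and $-\tfrac{1}{2}$ for the cyclic permutations of three, is the one remaining solution for $n = 3$.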

The number of possible solutions of the equations (14) may easily
be determined, since it must equal the number of different
eigenvalues of an arbitrary function $B$ of the $\chi$'s. We can express $B$ as
a linear function of the $\chi$'s with the help of equations (14); thus

$$B = b_1 \chi_1 + b_2 \chi_2 + ... + b_m \chi_m. \tag{15}$$

Similarly, we can express each of the quantities $B^2, B^3, ..., B^m$ as a
linear function of the $\chi$'s. From the $m$ equations thus obtained,
together with the equation $\chi(P_1) = 1$, we can eliminate the $m$
unknowns $\chi_1, \chi_2, ..., \chi_m$, obtaining as result an algebraic equation of
degree $m$ for $B$,

$$B^m + c_1 B^{m-1} + c_2 B^{m-2} + ... + c_m = 0.$$

The $m$ solutions of this equation give the $m$ possible eigenvalues
for $B$, each of which will, according to (15), be a linear function of $b_1$,
$b_2$,..., $b_m$ whose coefficients are a permissible set of values $\chi'_1, \chi'_2, ..., \chi'_m$.
The sets of values $\chi'$ thus obtained must be all different, since if
there were fewer than $m$ different permissible sets of values $\chi'$ for the
$\chi$'s, there would exist a linear function of the $\chi$'s every one of whose
eigenvalues vanishes, which would mean that the linear function itself
vanishes and the $\chi$'s are not linearly independent. Thus the number of
permissible sets of numerical values for the $\chi$'s is just equal to $m$, which
is the number of classes of permutations or the number of partitions
of $n$. This number is therefore the number of exclusive sets of states.

All dynamical variables of physical importance and all observable
quantities are symmetrical between the particles and thus commute
with all the $P$'s. Thus the only functions of the $P$'s of physical
importance are the $\chi$'s. The states corresponding to $|\chi'\rangle$ and to
$f(P)|\chi'\rangle$, where $|\chi'\rangle$ is any eigenket of the $\chi$'s belonging to the
eigenvalues $\chi'$ and $f(P)$ is any function of the $P$'s such that $f(P)|\chi'\rangle \neq 0$,
are observationally indistinguishable and are thus physically
equivalent. There is a definite number, $n(\chi')$ say, of independent kets which
can be formed by multiplying $|\chi'\rangle$ by functions of the $P$'s, which
number depends only on the $\chi'$'s. It is the number of rows and
columns in a matrix representation of the $P$'s in which each $\chi$ is
equal to $\chi'$. If $|\chi'\rangle$ corresponds to a stationary state, $n(\chi')$ will be
its degree of degeneracy (so far as concerns degeneracy caused by the
symmetry between the particles). This degeneracy cannot be removed
by any perturbation that is symmetrical between the particles.

57. Determination of the energy-levels

Let us apply the perturbation method of § 43 and make a first-order
calculation of the energy-levels in the case when the Hamiltonian
does not involve the time explicitly. We suppose that for our
unperturbed stationary states of the assembly each of the similar particles
has its own individual state. With $n$ particles, we shall have $n$ of
these states, corresponding to kets $|\alpha^1\rangle, |\alpha^2\rangle, ..., |\alpha^n\rangle$ say, which we
assume for the present to be all orthogonal. The ket for the assembly
is then
$$|X\rangle = |\alpha_1^1\rangle |\alpha_2^2\rangle \cdots |\alpha_n^n\rangle \tag{16}$$

like (1) with $\alpha^1, \alpha^2, ...$ instead of $a, b, ...$. If we apply any permutation
$P$ to it we get another ket

$$P|X\rangle = |\alpha_r^1\rangle |\alpha_s^2\rangle \cdots |\alpha_z^n\rangle \tag{17}$$

say, $r, s, ..., z$ being some permutation of the numbers 1, 2,..., $n$,
corresponding to another stationary state of the assembly with the
same energy. There are thus altogether $n!$ unperturbed states with
this energy, if we assume there are no other causes of degeneracy.
According to the method of § 43 when the unperturbed system is
degenerate, we must consider those elements of the matrix
representing the perturbing energy $V$ that refer to two states with the same
energy, i.e. those of the type $\langle X|P_a V P_b|X\rangle$. These will form a matrix
with $n!$ rows and columns, whose eigenvalues are the first-order
corrections in the energy-levels.

We must now introduce another kind of permutation operator
which can be applied to kets of the form (17), namely a permutation
which acts on the indices of the $\alpha$'s. We denote such a permutation
operator by $P^\alpha$. The essential difference between the $P$'s and the
$P^\alpha$'s may be seen in the following way. Let us consider a permutation
in the general sense, say that consisting of the interchange of 2 and 3.
This may be interpreted either as the interchange of the objects 2 and
3 or as the interchange of the objects in the places 2 and 3, these two
operations producing in general quite different results. The first of
these interpretations is the one that gives the operators $P$, the objects
concerned being the similar particles. A permutation $P$ can be
applied to an arbitrary ket for the assembly. A permutation with the
second interpretation has a meaning, however, only when applied
to a ket of the form (17), for which each of the particles is in a 'place'
specified by an $\alpha$, or to a sum of kets of the form (17). A permutation
$P$ may be considered as an ordinary dynamical variable. A
permutation $P^\alpha$ may be considered as a dynamical variable in a restricted
sense, valid when one is dealing only with states obtainable by
superposition of the various states (17). This is the case for our present
perturbation problem.

We can form algebraic functions of the $P^\alpha$ which will be other
operators applicable to kets of the form (17). In particular we can
form $\chi(P_c^\alpha)$, the average of all $P^\alpha$'s in a certain class $c$. This must
equal $\chi(P_c)$, the average of the permutation operators $P$ in the same
class, since the total set of all permutations in a given class must
evidently be the same whether the permutations are applied to the
particles or to the places the particles are in. Any $P$ commutes with
any $P^\alpha$, i.e.
$$P_a^\alpha P_b = P_b P_a^\alpha. \tag{18}$$

By labelling the $\alpha$'s by the same numbers 1, 2, 3,..., $n$ which label
the particles, we set up a one-one correspondence between the $\alpha$'s and
the particles, so that given any permutation $P_a$ applying to the
particles, we can give a meaning to the same permutation $P_a^\alpha$ applying
to the $\alpha$'s. This meaning is such that, for the ket $|X\rangle$ given by (16),

$$P_a^\alpha P_a|X\rangle = |X\rangle. \tag{19}$$

Since the various kets $|\alpha^1\rangle, |\alpha^2\rangle, ...$ are orthogonal, $|X\rangle$ and $P|X\rangle$ are
orthogonal unless $P = 1$. It follows that, for any coefficients $c_P$,

$$\sum_P c_P \langle X|P^\alpha P_a|X\rangle = c_{P_a}, \tag{20}$$


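provided $|X\rangle$ is normalized, the summation being over all the $n!$
permutations $P$ or $P^\alpha$, with $P_a$ fixed. Now define $V_P$ by

$$V_P = \langle X|V P|X\rangle. \tag{21}$$

We then have, for any two permutations $P_x$ and $P_y$,

$$\langle X|P_x V P_y|X\rangle = \langle X|V P_x P_y|X\rangle = V_{P_x P_y} = \sum_P V_P \langle X|P^\alpha P_x P_y|X\rangle$$

with the help of (20). From (18) this gives

$$\langle X|P_x V P_y|X\rangle = \sum_P V_P \langle X|P_x P^\alpha P_y|X\rangle. \tag{22}$$

We may write this result as

$$V \approx \sum_P V_P P^\alpha, \tag{23}$$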

where the sign $\approx$ means an equation in a restricted sense, the
operators on the two sides being equal so long as they are used only
with kets of the form $P|X\rangle$ and their conjugate imaginary bras.

The formula (23) shows that the perturbing energy $V$ is equal, in
the restricted sense, to a linear function of the permutation operators
$P^\alpha$ with coefficients $V_P$ given by (21). The restricted sense is adequate
for the calculation of the first-order correction in the energy-levels,
as this calculation involves only those matrix elements of $V$ given by
(22). The formula (23) is a very convenient one because the expression
on its right-hand side is easily handled.

As an example of an application of (23) we shall determine the
average energy of all those states, arising from the unperturbed state
(16), that belong to one exclusive set. This requires us to calculate
the average eigenvalue of $V$ for those states (17) for which the $\chi$'s
have specified numerical values $\chi'$. Now the average eigenvalue of
$P^\alpha$ for any of these states equals that of $P_a^\alpha P^\alpha (P_a^\alpha)^{-1}$ for arbitrary
$P_a^\alpha$ and thus equals that of $n!^{-1} \sum_{P_a} P_a^\alpha P^\alpha (P_a^\alpha)^{-1}$, which is $\chi'(P^\alpha)$ or
$\chi'(P)$. Hence the average eigenvalue of $V$ is $\sum_P V_P \chi'(P)$. A similar
method could be used for calculating the average eigenvalue of any
function of $V$, it being necessary only to replace each $P^\alpha$ by $\chi'(P)$ to
perform the averaging.

The number of energy-levels in an exclusive set $\chi = \chi'$ that arise
from a given state of the unperturbed system is equal to the number
of eigenvalues of the right-hand side of (23) that are consistent with
the equations $\chi = \chi'$. This number is the number $n(\chi')$ introduced
at the end of the preceding section, and is thus just the degree of
degeneracy of the states in this set.

We have assumed that the individual kets $|\alpha^1\rangle, |\alpha^2\rangle, ...$ which
determine the unperturbed state according to (16) are all orthogonal. The
theory can easily be extended to the case when some of these kets are
equal, any two that are not equal being still restricted to be orthogonal.
We now have some permutations $P^\alpha$ such that $P^\alpha|X\rangle = |X\rangle$,
namely those permutations which involve only interchanges of
equal $\alpha$'s. Equation (20) will now hold if the summation is extended
only over those $P$'s which make $P^\alpha|X\rangle$ different. With this change
in the meaning of $\sum$, all the previous equations still hold, including
the result (23). For the present $|X\rangle$ there will be restrictions on the
possible numerical values of the $\chi$'s, e.g. they cannot have those
values corresponding to $|X\rangle$ being antisymmetrical.

58.  Application  to  electrons 

Let us consider the case when the similar particles are electrons.
This requires, according to Pauli's exclusion principle discussed in
§ 54, that we take into account only the antisymmetrical states. It
is now necessary to make explicit reference to the fact that electrons
have spins, which show themselves through an angular momentum
and a magnetic moment. The effect of the spin on the motion of
an electron in an electromagnetic field is not very great. There
are additional forces on the electron due to its magnetic moment,
requiring additional terms in the Hamiltonian. The spin angular
momentum does not have any direct action on the motion, but it comes
into play when there are forces tending to rotate the magnetic moment,
since the magnetic moment and angular momentum are constrained
to be always in the same direction. In the absence of a strong
magnetic field these effects are all small, of the same order of
magnitude as the corrections required by relativistic mechanics, and there
would be no point in taking them into account in a non-relativistic
theory. The importance of the spin lies not in these small effects on the
motion of the electron, but in the fact that it gives two internal states
to the electron, corresponding to the two possible values of the spin
component in any assigned direction, which causes a doubling in the
number of independent states of an electron. This fact has far-reaching
consequences when combined with Pauli's exclusion principle.

In dealing with an assembly of electrons we have two kinds of
dynamical variables. The first kind, which we may call the orbital
variables, consists of the coordinates $x$, $y$, $z$ of all the electrons and
their conjugate momenta $p_x$, $p_y$, $p_z$. The second kind consists of the
spin variables, the variables $\sigma_x$, $\sigma_y$, $\sigma_z$, as introduced in § 37, for all
the electrons. These two kinds of variables belong to different degrees
of freedom. According to §§ 20 and 21, a ket fixing the state of the
whole system may be of the form $|A\rangle|B\rangle$, where $|A\rangle$ is a ket referring
to the orbital variables alone and $|B\rangle$ is a ket referring to the spin
variables alone, and the general ket fixing a state of the whole system
is a sum or integral of kets of this form. This way of looking at things
enables us to introduce two kinds of permutation operators, the first
kind, $P^x$ say, applying to the orbital variables only and operating
only on the factor $|A\rangle$ and the second kind, $P^\sigma$ say, applying only
to the spin variables and operating only on the factor $|B\rangle$. The $P^x$'s
and $P^\sigma$'s can each be applied to any ket for the whole system, not
merely to certain special kets, like the $P^\alpha$'s of the preceding section.
The permutations $P$ that we have had up to the present apply to all
the dynamical variables of the particles concerned, so for electrons
they will apply to both the orbital and the spin variables. This means
that each $P_a$ equals the product

$$P_a = P_a^x P_a^\sigma. \tag{24}$$

We can now see the need for taking the spin variables into account
when applying Pauli's exclusion principle, even if we neglect the spin
forces in the Hamiltonian. For any state occurring in nature each
$P_a$ must have the value $\pm 1$, according to whether it is an even or
an odd permutation, so from (24)

$$P_a^x P_a^\sigma = \pm 1. \tag{25}$$

The theory of the three preceding sections would become trivial if
applied directly to electrons, for which each $P_a = \pm 1$. We may,
however, apply it to the $P^x$ permutations of electrons. The $P^\sigma$'s are
constants of the motion if we neglect the terms in the Hamiltonian
that arise from the spin forces, since this neglect results in the
Hamiltonian not involving the spin dynamical variables $\sigma$ at all. The
$P^x$'s must then also be constants of the motion. We can now
introduce new $\chi$'s, equal to the average of all of the $P^x$'s in each class, and
assert that for any permissible set of numerical values $\chi'$ for these $\chi$'s
there will be one exclusive set of states. Thus there exist exclusive sets
of states for systems containing many electrons even when we restrict
ourselves to a consideration of only those states that satisfy Pauli's
principle. The exclusiveness of the sets of states is now, of course,
only approximate, since the $\chi$'s are constants only so long as we
neglect the spin forces. There will actually be a small probability for
a transition from a state in one set to a state in another.

Equation (25) gives us a simple connexion between the $P^x$'s and
$P^\sigma$'s, which means that instead of studying the dynamical variables
$P^x$ we can get all the results we want, e.g. the characters $\chi'$, by
studying the dynamical variables $P^\sigma$. The $P^\sigma$'s are much easier to
study on account of there being only two independent states of spin
for each electron. This fact results in there being fewer characters $\chi'$
for the group of permutations of the $\sigma$-variables than for the group
of general permutations, since it prevents a ket in the spin variables
from being antisymmetrical in more than two of them.

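The study of the $P^\sigma$'s is made specially easy by the fact that we
can express them as algebraic functions of the dynamical variables $\sigma$.
Consider the quantity

$$O_{12} = \tfrac{1}{2}\{1 + \sigma_{x1}\sigma_{x2} + \sigma_{y1}\sigma_{y2} + \sigma_{z1}\sigma_{z2}\} = \tfrac{1}{2}\{1 + (\sigma_1, \sigma_2)\}.$$

With the help of equations (50) and (51) of § 37 we find readily that

$$(\sigma_1, \sigma_2)^2 = (\sigma_{x1}\sigma_{x2} + \sigma_{y1}\sigma_{y2} + \sigma_{z1}\sigma_{z2})^2 = 3 - 2(\sigma_1, \sigma_2), \tag{26}$$

and hence that

$$O_{12}^2 = \tfrac{1}{4}\{1 + 2(\sigma_1, \sigma_2) + (\sigma_1, \sigma_2)^2\} = 1. \tag{27}$$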
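
Again, we find

$$O_{12}\,\sigma_{x1} = \tfrac{1}{2}\{\sigma_{x1} + \sigma_{x2} - i\sigma_{z1}\sigma_{y2} + i\sigma_{y1}\sigma_{z2}\},$$

$$\sigma_{x2}\,O_{12} = \tfrac{1}{2}\{\sigma_{x2} + \sigma_{x1} + i\sigma_{y1}\sigma_{z2} - i\sigma_{z1}\sigma_{y2}\},$$

and hence
$$O_{12}\,\sigma_{x1} = \sigma_{x2}\,O_{12}.$$

Similar relations hold for $\sigma_{y1}$ and $\sigma_{z1}$, so that we have

$$O_{12}\,\sigma_1 = \sigma_2\,O_{12}$$
or
$$O_{12}\,\sigma_1\,O_{12}^{-1} = \sigma_2.$$

From this we can obtain with the help of (27)

$$O_{12}\,\sigma_2\,O_{12}^{-1} = \sigma_1.$$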

These commutation relations for $O_{12}$ with $\sigma_1$ and $\sigma_2$ are precisely the
same as those for $P_{12}^\sigma$, the permutation consisting of the interchange
of the spin variables of electrons 1 and 2. Thus we can put

$$O_{12} = c P_{12}^\sigma,$$

where $c$ is a number. Equation (27) shows that $c = \pm 1$. To
determine which of these values for $c$ is the correct one, we observe that
the eigenvalues of $P_{12}^\sigma$ are 1, 1, 1, $-1$, corresponding to the fact that
there exist three independent symmetrical and one antisymmetrical
state in the spin variables of two electrons, namely, with the notation
of § 37, the states represented by the three symmetrical functions
$f_\alpha(\sigma'_{z1}) f_\alpha(\sigma'_{z2})$, $f_\beta(\sigma'_{z1}) f_\beta(\sigma'_{z2})$, $f_\alpha(\sigma'_{z1}) f_\beta(\sigma'_{z2}) + f_\beta(\sigma'_{z1}) f_\alpha(\sigma'_{z2})$, and the one
antisymmetrical function $f_\alpha(\sigma'_{z1}) f_\beta(\sigma'_{z2}) - f_\beta(\sigma'_{z1}) f_\alpha(\sigma'_{z2})$. Thus the mean
of the eigenvalues of $P_{12}^\sigma$ is $\tfrac{1}{2}$. Now the mean of the eigenvalues of
$(\sigma_1, \sigma_2)$ is evidently zero and hence the mean of the eigenvalues of $O_{12}$
is $\tfrac{1}{2}$. Thus we must have $c = +1$, and so we can put

$$P_{12}^\sigma = \tfrac{1}{2}\{1 + (\sigma_1, \sigma_2)\}. \tag{28}$$
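The relations (26), (27) and (28) can be confirmed with explicit matrices. The sketch below is an illustration under stated assumptions: it uses the usual 2 x 2 Pauli matrices for $\sigma_x$, $\sigma_y$, $\sigma_z$, represents the two-electron spin space by Kronecker products, and relies on small helper functions (`kron`, `matmul`, `combine`) written for this example only.

```python
I2 = [[1, 0], [0, 1]]
SIGMA = [
    [[0, 1], [1, 0]],        # sigma_x
    [[0, -1j], [1j, 0]],     # sigma_y
    [[1, 0], [0, -1]],       # sigma_z
]


def kron(a, b):
    """Kronecker product of two matrices given as nested lists."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def combine(*terms):
    """Sum of coefficient * matrix over (coefficient, matrix) pairs."""
    size = len(terms[0][1])
    return [[sum(c * m[i][j] for c, m in terms) for j in range(size)]
            for i in range(size)]


I4 = kron(I2, I2)
sigma_1 = [kron(s, I2) for s in SIGMA]   # spin variables of electron 1
sigma_2 = [kron(I2, s) for s in SIGMA]   # spin variables of electron 2

# (sigma_1, sigma_2) = sum over x, y, z of the products of components
dot = combine(*[(1, matmul(a, b)) for a, b in zip(sigma_1, sigma_2)])
O_12 = combine((0.5, I4), (0.5, dot))

# Interchange of the two spins in the basis (++, +-, -+, --)
SWAP = [[1, 0, 0, 0],
        [0, 0, 1, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 1]]
```

In this basis `O_12` is exactly the matrix that interchanges the two spins, its square is the unit matrix as in (27), and `dot` satisfies (26). Its eigenvalues 1, 1, 1, $-1$ correspond to the three symmetrical and one antisymmetrical spin states.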

In this way any permutation $P^\sigma$ consisting simply of an interchange
can be expressed as an algebraic function of the $\sigma$'s. Any other
permutation $P^\sigma$ can be expressed as a product of interchanges and can
therefore also be expressed as a function of the $\sigma$'s. With the help of
(25) we can now express the $P^x$'s as algebraic functions of the $\sigma$'s and
eliminate the $P^\sigma$'s from the discussion. We have, since the $-$ sign
must be taken in (25) when the permutations are interchanges and
since the square of an interchange is unity,

$$P_{12}^x = -\tfrac{1}{2}\{1 + (\sigma_1, \sigma_2)\}. \tag{29}$$

The formula (29) may conveniently be used for the evaluation of
the characters $\chi'$ which define the exclusive sets of states. We have,
for example, for the permutations consisting of interchanges,

$$\chi_{12} = \chi(P_{12}^x) = -\frac{1}{2}\Big\{1 + \frac{2}{n(n-1)} \sum_{r<t} (\sigma_r, \sigma_t)\Big\}.$$

If we introduce the dynamical variable $s$ to describe the magnitude of
the total spin angular momentum, $\tfrac{1}{2}\sum_r \sigma_r$ in units of $\hbar$, through the
formula
$$s(s+1) = \Big(\tfrac{1}{2}\sum_r \sigma_r,\ \tfrac{1}{2}\sum_t \sigma_t\Big),$$

in agreement with (39) of § 36, we have

$$2\sum_{r<t} (\sigma_r, \sigma_t) = \Big(\sum_r \sigma_r,\ \sum_t \sigma_t\Big) - \sum_r (\sigma_r, \sigma_r) = 4s(s+1) - 3n.$$

Hence
$$\chi_{12} = -\frac{1}{2}\Big\{1 + \frac{4s(s+1) - 3n}{n(n-1)}\Big\} = -\frac{n(n-4) + 4s(s+1)}{2n(n-1)}. \tag{30}$$

Thus $\chi_{12}$ is expressible as a function of the dynamical variable $s$ and
of $n$ the number of electrons. Any of the other $\chi$'s could be evaluated
on similar lines and would have to be a function of $s$ and $n$ only, since
there are no other symmetrical functions of all the $\sigma$ dynamical
variables which could be involved. There is therefore one set of
numerical values $\chi'$ for the $\chi$'s, and thus one exclusive set of states,
for each eigenvalue $s'$ of $s$. The eigenvalues of $s$ are

$$\tfrac{1}{2}n,\quad \tfrac{1}{2}n - 1,\quad \tfrac{1}{2}n - 2,\quad ...,$$

the series terminating with 0 or $\tfrac{1}{2}$.
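Formula (30) is easy to tabulate, and the operator identity used to derive it can be checked with explicit spin matrices. The sketch below is illustrative only: the function names are invented for this example, the usual Pauli matrices are assumed for the $\sigma$'s, and the identity is tested for $n = 3$ in a form cleared of fractions so that the arithmetic is exact.

```python
from fractions import Fraction


def spin_values(n):
    """Eigenvalues n/2, n/2 - 1, ... of s, ending with 0 or 1/2."""
    values, s = [], Fraction(n, 2)
    while s >= 0:
        values.append(s)
        s -= 1
    return values


def chi_12(n, s):
    """Right-hand side of (30)."""
    return -(n * (n - 4) + 4 * s * (s + 1)) / Fraction(2 * n * (n - 1))


I2 = [[1, 0], [0, 1]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]


def kron(a, b):
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def zeros(dim):
    return [[0] * dim for _ in range(dim)]


def site_operator(k, r, n):
    """Component k of sigma for electron r (0-based) among n spins."""
    out = [[1]]
    for position in range(n):
        out = kron(out, SIGMA[k] if position == r else I2)
    return out


def pair_sum(n):
    """Sum over r < t of (sigma_r, sigma_t)."""
    total = zeros(2 ** n)
    for r in range(n):
        for t in range(r + 1, n):
            for k in range(3):
                total = add(total, matmul(site_operator(k, r, n),
                                          site_operator(k, t, n)))
    return total


def total_sigma_squared(n):
    """(sum_r sigma_r, sum_t sigma_t), which the text equates to 4s(s+1)."""
    total = zeros(2 ** n)
    for k in range(3):
        component = zeros(2 ** n)
        for r in range(n):
            component = add(component, site_operator(k, r, n))
        total = add(total, matmul(component, component))
    return total


n = 3
lhs = [[2 * x for x in row] for row in pair_sum(n)]
rhs = [[x - (3 * n if i == j else 0) for j, x in enumerate(row)]
       for i, row in enumerate(total_sigma_squared(n))]
table = {m: [chi_12(m, s) for s in spin_values(m)] for m in (2, 3, 4)}
```

For two electrons `table[2]` gives $-1$ for $s = 1$ and $+1$ for $s = 0$. For three electrons the values are $-1$ and 0, so the value $+1$ belonging to the orbitally symmetrical set never appears, in line with the remark that a spin ket cannot be antisymmetrical in more than two electrons.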

We see in this way that each of the stationary states of a system
with several electrons is an eigenstate of $s$, the magnitude in units of
$\hbar$ of the total spin angular momentum $\tfrac{1}{2}\sum_r \sigma_r$, belonging to a definite
eigenvalue $s'$. For any given $s'$ there will be $2s'+1$ possible values
for a component of the total spin vector in any direction and these
will correspond to $2s'+1$ independent stationary states with the same
energy. When we do not neglect the forces due to the spin magnetic
moments these $2s'+1$ states will in general be split up into $2s'+1$
states with slightly different energies, and will thus form a multiplet
of multiplicity $2s'+1$. Transitions in which $s'$ changes, i.e. transitions
from one multiplicity to another, cannot occur when the spin forces
are neglected and will have only a small probability of occurrence
when the spin forces are not neglected.

We can determine the energy-levels of a system with several
electrons to the first approximation by applying the theory of the
preceding section with the kets $|\alpha^r\rangle$ referring only to the orbital
variables and using formula (23). If we consider only the Coulomb
forces between the electrons, then the interaction energy $V$ will
consist of a sum of parts each referring to only two electrons, which
will result in all the matrix elements $V_P$ vanishing except those for
which $P^\alpha$ is the identical permutation or is simply an interchange of
two electrons. Thus (23) will reduce to

$$V \approx V_1 + \sum_{r<s} V_{rs} P_{rs}^\alpha, \tag{31}$$

$V_{rs}$ being the matrix element referring to the interchange of electrons
$r$ and $s$. Since the $P^\alpha$'s have the same properties as the $P^x$'s, any
function of the $P^\alpha$'s will have the same eigenvalues as the
corresponding function of the $P^x$'s, so that the right-hand side of (31)
will have the same eigenvalues as

$$V_1 + \sum_{r<s} V_{rs} P_{rs}^x,$$

or
$$V_1 - \tfrac{1}{2}\sum_{r<s} V_{rs}\{1 + (\sigma_r, \sigma_s)\} \tag{32}$$

from (29). The eigenvalues of (32) will give the first-order corrections
in the energy-levels. The form of (32) shows that a model which
assumes a coupling energy between the spins of the various electrons,
of magnitude $-\tfrac{1}{2}V_{rs}(\sigma_r, \sigma_s)$ for the electrons in the $r$ and $s$ orbital
states, would meet with a fair amount of success. This coupling
energy is much greater than that of the spin magnetic moments. Such
models of the atom were in use before the justification by quantum
mechanics was obtained.
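As a worked illustration of (32), not given in the text, consider the simplest case of two electrons in different orbital states. The only assumption beyond (32) is the two-spin identity $(\sigma_1 + \sigma_2)^2 = 6 + 2(\sigma_1, \sigma_2) = 4s(s+1)$, which is the $n = 2$ case of the relation used in deriving (30).

```latex
\begin{align*}
  (\sigma_1,\sigma_2) &= 2s(s+1) - 3
      = \begin{cases} \phantom{-}1, & s = 1,\\ -3, & s = 0, \end{cases}\\[4pt]
  V_1 - \tfrac{1}{2}V_{12}\{1 + (\sigma_1,\sigma_2)\}
      &= \begin{cases} V_1 - V_{12}, & s = 1 \quad (\text{three states}),\\
                       V_1 + V_{12}, & s = 0 \quad (\text{one state}). \end{cases}
\end{align*}
```

The first-order energies of the triplet and the singlet thus differ by $2V_{12}$, which is the effect that the spin-coupling model with energy $-\tfrac{1}{2}V_{12}(\sigma_1, \sigma_2)$ reproduces.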

We may have two of the orbital states of the unperturbed system
the same, i.e. the kets $|\alpha^r\rangle$ in the orbital variables for two electrons
may be the same. Suppose $|\alpha^1\rangle$ and $|\alpha^2\rangle$ are the same. Then we must
take only those eigenvalues of (31) that are consistent with $P_{12}^\alpha = 1$,
or those eigenvalues of (32) that are consistent with $P_{12}^x = 1$ or
$P_{12}^\sigma = -1$. From (28) this condition gives $(\sigma_1, \sigma_2) = -3$, so that
$(\sigma_1 + \sigma_2)^2 = 0$. Thus the resultant of the two spins $\sigma_1$ and $\sigma_2$ is zero,
which may be interpreted as the spins $\sigma_1$ and $\sigma_2$ being antiparallel.
Thus we may say that two electrons in the same orbital state have
their spins antiparallel. More than two electrons cannot be in the
same orbital state.
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59.  An  assembly  of  bosons 

We consider a dynamical system composed of $u'$ similar particles.
We set up a representation for one of the particles with discrete basic
kets $|\alpha^{(1)}\rangle$, $|\alpha^{(2)}\rangle$, $|\alpha^{(3)}\rangle$,....
Then, as explained in § 54, we get a symmetrical representation of the
assembly of $u'$ particles by taking as basic kets the products

$$ |\alpha_1^a\rangle |\alpha_2^b\rangle |\alpha_3^c\rangle \ldots |\alpha_{u'}^g\rangle = |\alpha_1^a \alpha_2^b \alpha_3^c \ldots \alpha_{u'}^g\rangle, \tag{1} $$

in which there is one factor for each particle, the suffixes 1, 2, 3,..., $u'$
of the $\alpha$'s being the labels of the particles and the indices
$a, b, c, \ldots, g$ denoting indices (1), (2), (3),... in the basic kets for
one particle. If the particles are bosons, so that only symmetrical
states occur in nature, then we need to work with only the symmetrical
kets that can be constructed from the kets (1). The states
corresponding to these symmetrical kets will form a complete set of
states for the assembly of bosons. We can build up a theory of them as
follows.

We introduce the linear operator $S$ defined by

$$ S = u'!^{-1/2} \sum P, \tag{2} $$

the sum being taken over all the $u'!$ permutations of the $u'$ particles.
Then $S$ applied to any ket for the assembly gives a symmetrical ket.
We may therefore call $S$ the symmetrizing operator. From (8) of § 55
it is real. Applied to the ket (1) it gives

$$ u'!^{-1/2} \sum P |\alpha_1^a \alpha_2^b \alpha_3^c \ldots \alpha_{u'}^g\rangle = S|\alpha^a \alpha^b \alpha^c \ldots \alpha^g\rangle, \tag{3} $$

the labels of the particles being omitted on the right-hand side as
they are no longer relevant. The ket (3) corresponds to a state for
the assembly of $u'$ bosons with a definite distribution of the bosons
among the various boson states, without any particular boson being
assigned to any particular state. The distribution of bosons is
specified if we specify how many bosons are in each boson state. Let
$n'_1, n'_2, n'_3, \ldots$ be the numbers of bosons in the states
$\alpha^{(1)}, \alpha^{(2)}, \alpha^{(3)}, \ldots$ respectively with this
distribution. The $n'$'s are defined algebraically by the equation

$$ \alpha^a + \alpha^b + \alpha^c + \ldots + \alpha^g = n'_1 \alpha^{(1)} + n'_2 \alpha^{(2)} + n'_3 \alpha^{(3)} + \ldots. \tag{4} $$

The sum of the $n'$'s is of course $u'$. The number of $n'$'s is equal to
the number of basic kets $|\alpha^{(r)}\rangle$, which in most applications of
the theory is very much greater than $u'$, so most of the $n'$'s will be
zero. If $\alpha^a, \alpha^b, \alpha^c, \ldots, \alpha^g$ are all different, i.e.
if the $n'$'s are all 0 or 1, the ket (3) is normalized, since in this case
the terms on the left-hand side of (3) are all orthogonal to one another
and each contributes $u'!^{-1}$ to the squared length of the ket. However,
if $\alpha^a, \alpha^b, \alpha^c, \ldots, \alpha^g$ are not all different, those
terms on the left-hand side of (3) will be equal which arise from
permutations $P$ which merely interchange bosons in the same state.
The number of equal terms will be $n'_1!\, n'_2!\, n'_3! \ldots$, so the
squared length of the ket (3) will be

$$ \langle\alpha^a \alpha^b \alpha^c \ldots \alpha^g| S^2 |\alpha^a \alpha^b \alpha^c \ldots \alpha^g\rangle = n'_1!\, n'_2!\, n'_3! \ldots. \tag{5} $$

For dealing with a general state of the assembly we can introduce
the numbers $n_1, n_2, n_3, \ldots$ of bosons in the states
$\alpha^{(1)}, \alpha^{(2)}, \alpha^{(3)}, \ldots$ respectively and treat the
$n$'s as dynamical variables or as observables. They have the
eigenvalues 0, 1, 2,..., $u'$. The ket (3) is a simultaneous eigenket of
all the $n$'s, belonging to the eigenvalues $n'_1, n'_2, n'_3, \ldots$. The
various kets (3) form a complete set for the dynamical system
consisting of $u'$ bosons, so the $n$'s all commute (see the converse to
the theorem of § 13). Further, there is only one independent ket (3)
belonging to any set of eigenvalues $n'_1, n'_2, n'_3, \ldots$. Hence the
$n$'s form a complete set of commuting observables. If we normalize the
kets (3) and then label the resulting kets by the eigenvalues of the
$n$'s to which they belong, i.e. if we put

$$ (n'_1!\, n'_2!\, n'_3! \ldots)^{-1/2} S|\alpha^a \alpha^b \alpha^c \ldots \alpha^g\rangle = |n'_1 n'_2 n'_3 \ldots\rangle, \tag{6} $$

we get a set of kets $|n'_1 n'_2 n'_3 \ldots\rangle$, with the $n'$'s taking on
all non-negative integral values adding up to $u'$, which kets will form
the basic kets of a representation with the $n$'s diagonal.
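
The normalizing factor in (6) rests on the squared length (5). As a
check that is not part of the original text, the following Python sketch
builds the product kets (1) as explicit vectors, applies the
symmetrizing operator (2), and compares the squared length with the
product of factorials. The representation of a ket as a NumPy vector
and all function names are assumptions made for the sketch.

```python
import itertools
import math
from collections import Counter

import numpy as np


def product_ket(indices, d):
    """Product ket (1) as a flat vector; each particle has d basic states."""
    vec = np.ones(1)
    for i in indices:
        e = np.zeros(d)
        e[i] = 1.0
        vec = np.kron(vec, e)
    return vec


def symmetrized_ket(indices, d):
    """Apply S = (u'!)^(-1/2) * (sum over all permutations P), as in (2)."""
    u = len(indices)
    total = np.zeros(d ** u)
    for perm in itertools.permutations(range(u)):
        total += product_ket([indices[p] for p in perm], d)
    return total / math.sqrt(math.factorial(u))


def squared_length(indices, d):
    """Left-hand side of (5)."""
    ket = symmetrized_ket(indices, d)
    return float(ket @ ket)


def occupation_factorials(indices):
    """Right-hand side of (5): the product of n'! over the occupied states."""
    return math.prod(math.factorial(n) for n in Counter(indices).values())


if __name__ == "__main__":
    for indices, d in [((0, 1, 2), 3), ((0, 0, 1), 2), ((1, 1, 1), 2)]:
        print(indices, squared_length(indices, d), occupation_factorials(indices))
```

For three bosons in three different states the two numbers are both 1,
with two of them in the same state both are 2, and with all three in the
same state both are 6.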

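The $n$'s can be expressed as functions of the observables
$\alpha_1, \alpha_2, \alpha_3, \ldots, \alpha_{u'}$ which define the basic
kets of the individual bosons by means of the equations

$$ n_a = \sum_r \delta_{\alpha_r \alpha^a}, \tag{7} $$

or the equations

$$ \sum_a n_a f(\alpha^a) = \sum_r f(\alpha_r) \tag{8} $$

holding for any function $f$.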
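
A small Python sketch, added here only as an illustration, makes the
counting in (7) and (8) concrete. The list `particle_states` stands for
the values $\alpha_1, \alpha_2, \ldots$ taken by the individual bosons; this
name and the helper functions are assumptions of the sketch.

```python
from collections import Counter


def occupation_numbers(particle_states):
    """Equation (7): n_a counts the particles r whose state equals alpha^a."""
    return Counter(particle_states)


def sum_over_particles(f, particle_states):
    """Right-hand side of (8): the sum over r of f(alpha_r)."""
    return sum(f(alpha) for alpha in particle_states)


def sum_over_states(f, particle_states):
    """Left-hand side of (8): the sum over a of n_a f(alpha^a)."""
    n = occupation_numbers(particle_states)
    return sum(n_a * f(alpha) for alpha, n_a in n.items())


if __name__ == "__main__":
    states = [1, 3, 3, 2, 3]          # five bosons, three of them in state 3
    energy = lambda alpha: alpha ** 2 + 1
    print(occupation_numbers(states))
    print(sum_over_particles(energy, states), sum_over_states(energy, states))
```

Taking $f$ to be the energy of one boson turns the same identity into
the statement that the energy of the assembly is a sum over states
weighted by the occupation numbers.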

Let us now suppose that the number of bosons in the assembly is
not given, but is variable. This number is then a dynamical variable
or observable $u$, with eigenvalues 0, 1, 2,..., and the ket (3) is an
eigenket of $u$ belonging to the eigenvalue $u'$. To get a complete
set of kets for our dynamical system we must now take all the
symmetrical kets (3) for all values of $u'$. We may arrange them in
order thus

$$ |\rangle, \quad |\alpha^a\rangle, \quad S|\alpha^a \alpha^b\rangle, \quad S|\alpha^a \alpha^b \alpha^c\rangle, \quad \ldots, \tag{9} $$

where first is written the ket, with no label, corresponding to the
state with no bosons present, then come the kets corresponding to
states with one boson present, then those corresponding to states
with two bosons, and so on. A general state corresponds to a ket
which is a sum of the various kets (9). The kets (9) are all orthogonal
to one another, two kets referring to the same number of bosons being
orthogonal as before, and two referring to different numbers of bosons
being orthogonal since they are eigenkets of $u$ belonging to different
eigenvalues. By normalizing all the kets (9), we get a set of kets like
(6) with no restriction on the $n'$'s (i.e. each $n'$ taking on all
non-negative integral values) and these kets form the basic kets of a
representation with the $n$'s diagonal for the dynamical system
consisting of a variable number of bosons.

If there is no interaction between the bosons and if the basic kets
$|\alpha^{(1)}\rangle, |\alpha^{(2)}\rangle, \ldots$ correspond to stationary
states of a boson, the kets (9) will correspond to stationary states for
the assembly of bosons. The number $u$ of bosons is now constant in
time, but it need not be a specified number, i.e. the general state is a
superposition of states with various values for $u$. If the energy of one
boson is $H(\alpha)$, the energy of the assembly will be

$$ \sum_r H(\alpha_r) = \sum_a n_a H^a \tag{10} $$

from (8), $H^a$ being short for the number $H(\alpha^a)$. This gives the
Hamiltonian for the assembly as a function of the dynamical
variables $n$.


60.  The  connexion  between  bosons  and  oscillators 

In § 34 we studied the harmonic oscillator, a dynamical system of
one degree of freedom describable in terms of a canonical $q$ and $p$,
such that the Hamiltonian is a sum of squares of $q$ and $p$, with
numerical coefficients. We define a general oscillator mathematically
as a system of one degree of freedom describable in terms of a
canonical $q$ and $p$, such that the Hamiltonian is a power series in $q$
and $p$, and remains so if the system is perturbed in any way. We
shall now study a dynamical system composed of several of these
oscillators. We can describe each oscillator in terms of, instead of
$q$ and $p$, a complex dynamical variable $\eta$, like the $\eta$ of § 34, and
its conjugate complex $\bar\eta$, satisfying the commutation relation (7) of
§ 34. We attach labels 1, 2, 3,... to the different oscillators, so that
the whole set of oscillators is describable in terms of the dynamical
variables $\eta_1, \eta_2, \eta_3, \ldots, \bar\eta_1, \bar\eta_2, \bar\eta_3, \ldots$
satisfying the commutation relations

$$ \eta_a \eta_b - \eta_b \eta_a = 0, \qquad \bar\eta_a \bar\eta_b - \bar\eta_b \bar\eta_a = 0, \qquad \bar\eta_a \eta_b - \eta_b \bar\eta_a = \delta_{ab}. \tag{11} $$

Put

$$ \eta_a \bar\eta_a = n_a, \tag{12} $$

so that

$$ \bar\eta_a \eta_a = n_a + 1. \tag{13} $$

The $n$'s are observables which commute with one another and the
work of § 34 shows that each of them has as eigenvalues all
non-negative integers. For the $a$th oscillator there is a standard ket
for the Fock representation, $|0_a\rangle$ say, which is a normalized
eigenket of $n_a$ belonging to the eigenvalue zero. By multiplying all
these standard kets together we get a standard ket for the Fock
representation for the set of oscillators,

$$ |0_1\rangle |0_2\rangle |0_3\rangle \ldots, \tag{14} $$

which is a simultaneous eigenket of all the $n$'s belonging to the
eigenvalues zero. We shall denote it simply by $|0\rangle$. From (13) of § 34

$$ \bar\eta_a |0\rangle = 0 \tag{15} $$

for any $a$. The work of § 34 also shows that, if $n'_1, n'_2, n'_3, \ldots$ are
any non-negative integers,

$$ \eta_1^{n'_1} \eta_2^{n'_2} \eta_3^{n'_3} \ldots |0\rangle \tag{16} $$

is a simultaneous eigenket of all the $n$'s belonging to the eigenvalues
$n'_1, n'_2, n'_3, \ldots$ respectively. The various kets (16) obtained by
taking different $n'$'s form a complete set of kets all orthogonal to one
another and the square of the length of one of them is, from (16) of
§ 34, $n'_1!\, n'_2!\, n'_3! \ldots$. From this we see, bearing in mind the
result (5), that the kets (16) have just the same properties as the kets
(9), so that we can equate each ket (16) to the ket (9) referring to the
same $n'$ values without getting any inconsistency. This involves
putting

$$ S|\alpha^a \alpha^b \alpha^c \ldots \alpha^g\rangle = \eta_a \eta_b \eta_c \ldots \eta_g |0\rangle. \tag{17} $$

The standard ket $|0\rangle$ becomes equal to the first of the kets (9),
corresponding to no bosons present.
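
The properties of the oscillator variables used above can be checked
numerically. The sketch below is an illustration added to the text; it
represents $\eta$ and $\bar\eta$ by matrices on a Fock space cut off at
`n_max` quanta, which is an approximation introduced only for the
sketch, so the commutation relation fails in the last row and column and
the checks avoid that corner. The function names are hypothetical.

```python
import math

import numpy as np


def ladder_operators(n_max):
    """Return (eta, eta_bar) as matrices on the states with 0..n_max quanta.

    eta raises the number of quanta by one and eta_bar lowers it, so that
    eta @ eta_bar is the number operator n of equation (12).
    """
    dim = n_max + 1
    eta_bar = np.zeros((dim, dim))
    for n in range(1, dim):
        eta_bar[n - 1, n] = math.sqrt(n)
    return eta_bar.T.copy(), eta_bar


def squared_length_two_modes(n1, n2, n_max=6):
    """Squared length of eta_1^n1 eta_2^n2 |0> for two oscillators, as in (16)."""
    eta, _ = ladder_operators(n_max)
    identity = np.eye(n_max + 1)
    eta_1 = np.kron(eta, identity)
    eta_2 = np.kron(identity, eta)
    vacuum = np.zeros((n_max + 1) ** 2)
    vacuum[0] = 1.0
    ket = (np.linalg.matrix_power(eta_1, n1)
           @ np.linalg.matrix_power(eta_2, n2) @ vacuum)
    return float(ket @ ket)


if __name__ == "__main__":
    eta, eta_bar = ladder_operators(6)
    commutator = eta_bar @ eta - eta @ eta_bar
    # Away from the cut-off the commutator is the unit matrix, as in (11).
    print(np.allclose(commutator[:6, :6], np.eye(6)))
    print(squared_length_two_modes(2, 3), math.factorial(2) * math.factorial(3))
```

The squared length agrees with the product of factorials in (5), which
is the property that allows the kets (16) to be equated to the kets (9).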

The effect of equation (17) is to identify the states of an assembly
of bosons with the states of a set of oscillators. This means that the
dynamical system consisting of an assembly of similar bosons is
equivalent to the dynamical system consisting of a set of oscillators;
the two systems are just the same system looked at from two different
points of view. There is one oscillator associated with each
independent boson state. We have here one of the most fundamental
results of quantum mechanics, which enables a unification of the wave
and corpuscular theories of light to be effected.

Our work in the preceding section was built up on a discrete set
of basic kets $|\alpha^a\rangle$ for a boson. We could pass to a different
discrete set of basic kets, $|\beta^A\rangle$ say, and build up a similar
theory on them. The basic kets for the assembly would then be, instead
of (9),

$$ |\rangle, \quad |\beta^A\rangle, \quad S|\beta^A \beta^B\rangle, \quad S|\beta^A \beta^B \beta^C\rangle, \quad \ldots. \tag{18} $$

The first of the kets (18), referring to no bosons present, is the same
as the first of the kets (9). Those kets (18) referring to one boson
present are linear functions of those kets (9) referring to one boson
present, namely

$$ |\beta^A\rangle = \sum_a |\alpha^a\rangle \langle\alpha^a|\beta^A\rangle, \tag{19} $$

and generally those kets (18) referring to $u'$ bosons present are linear
functions of those kets (9) referring to $u'$ bosons present. Associated
with the new basic states $\beta^A$ for a boson there will be a new set
of oscillator variables $\eta_A$, and corresponding to (17) we shall have

$$ S|\beta^A \beta^B \beta^C \ldots\rangle = \eta_A \eta_B \eta_C \ldots |0\rangle. \tag{20} $$

Thus a ket $\eta_A \eta_B \ldots |0\rangle$ with $u'$ factors
$\eta_A, \eta_B, \ldots$ must be a linear function of kets
$\eta_a \eta_b \ldots |0\rangle$ with $u'$ factors $\eta_a, \eta_b, \ldots$. It
follows that each linear operator $\eta_A$ must be a linear function of
the $\eta_a$'s. Equation (19) gives

$$ \eta_A |0\rangle = \sum_a \eta_a |0\rangle \langle\alpha^a|\beta^A\rangle, $$

and hence

$$ \eta_A = \sum_a \eta_a \langle\alpha^a|\beta^A\rangle. \tag{21} $$

Thus the $\eta$'s transform according to the same law as the basic kets
for a boson. The transformed $\eta$'s satisfy, with their conjugate
complexes, the same commutation relations (11) as the original ones.
The transformed $\eta$'s are on just the same footing as the original
ones and hence, when we look upon our dynamical system as a set of
oscillators, the different degrees of freedom have no invariant
significance.
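
The statement that the transformed variables satisfy the same
commutation relations can be verified in one line. The following worked
step is added for the reader; it assumes only (11), the transformation
law (21) with its conjugate complex
$\bar\eta_A = \sum_a \langle\beta^A|\alpha^a\rangle \bar\eta_a$, and the
orthonormality and completeness of the two sets of basic kets.

```latex
\begin{align*}
\bar\eta_A \eta_B - \eta_B \bar\eta_A
  &= \sum_{a,b} \langle\beta^A|\alpha^a\rangle \langle\alpha^b|\beta^B\rangle
     \,(\bar\eta_a \eta_b - \eta_b \bar\eta_a) \\
  &= \sum_{a} \langle\beta^A|\alpha^a\rangle \langle\alpha^a|\beta^B\rangle
   = \langle\beta^A|\beta^B\rangle = \delta_{AB}.
\end{align*}
```

The remaining two relations follow at once, since the transformed
$\eta$'s are linear in the original $\eta$'s alone, which all commute with
one another, and similarly for the conjugate complexes.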

The $\bar\eta$'s transform according to the same law as the basic bras
for a boson, and thus the same law as the numbers
$\langle\alpha^a|x\rangle$ forming the representative of a state $x$. This
similarity people often describe by saying that the $\bar\eta_a$'s are
given by a process of second quantization applied to
$\langle\alpha^a|x\rangle$, meaning thereby that, after one has set up a
quantum theory for a single particle and so introduced the numbers
$\langle\alpha^a|x\rangle$ representing a state of the particle, one can
make these numbers into linear operators satisfying with their
conjugate complexes the correct commutation relations, like (11), and
one then has the appropriate mathematical basis for dealing with an
assembly of the particles, provided they are bosons. There is a
corresponding procedure for fermions, which will be given in § 65.

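Since an assembly of bosons is the same as a set of oscillators, it
must be possible to express any symmetrical function of the boson
variables in terms of the oscillator variables $\eta$ and $\bar\eta$. An
example of this is provided by equation (10) with $\eta_a \bar\eta_a$
substituted for $n_a$. Let us see how it goes in general. Take first the
case of a function of the boson variables of the form

$$ U_T = \sum_r U_r, \tag{22} $$

where each $U_r$ is a function only of the dynamical variables of the
$r$th boson, so that it has a representative
$\langle\alpha_r^a|U_r|\alpha_r^b\rangle$ referring to the basic kets
$|\alpha_r^a\rangle$ of the $r$th boson. In order that $U_T$ may be
symmetrical, this representative must be the same for all $r$, so that it
can depend only on the two eigenvalues labelled by $a$ and $b$. We may
therefore write it

$$ \langle\alpha_r^a|U_r|\alpha_r^b\rangle = \langle\alpha^a|U|\alpha^b\rangle = \langle a|U|b\rangle \tag{23} $$

for brevity. We have

$$ U_r |\alpha_1^{x_1} \alpha_2^{x_2} \ldots\rangle = \sum_a |\alpha_1^{x_1} \alpha_2^{x_2} \ldots \alpha_r^a \ldots\rangle \langle a|U|x_r\rangle. \tag{24} $$

Summing this equation for all values of $r$ and applying the
symmetrizing operator $S$ to both sides, we get

$$ S U_T |\alpha_1^{x_1} \alpha_2^{x_2} \ldots\rangle = \sum_r \sum_a S|\alpha^{x_1} \alpha^{x_2} \ldots \alpha^a \ldots\rangle \langle a|U|x_r\rangle. \tag{25} $$

Since $U_T$ is symmetrical we can replace $S U_T$ by $U_T S$ and can then
substitute for the symmetrical kets in (25) their values given by (17).
We get in this way

$$ U_T \eta_{x_1} \eta_{x_2} \ldots |0\rangle = \sum_a \sum_r \eta_a \eta_{x_r}^{-1} \eta_{x_1} \eta_{x_2} \ldots |0\rangle \langle a|U|x_r\rangle $$
$$ \qquad = \sum_{ab} \eta_a \sum_r \eta_{x_r}^{-1} \eta_{x_1} \eta_{x_2} \ldots |0\rangle\, \delta_{b x_r} \langle a|U|b\rangle, \tag{26} $$

$\eta_{x_r}^{-1}$ meaning that the factor $\eta_{x_r}$ must be cancelled out.
Now from (15) and the commutation relations (11)

$$ \bar\eta_b \eta_{x_1} \eta_{x_2} \ldots |0\rangle = \sum_r \eta_{x_r}^{-1} \eta_{x_1} \eta_{x_2} \ldots |0\rangle\, \delta_{b x_r} \tag{27} $$

(note that $\bar\eta_b$ is like the operator of partial differentiation
$\partial/\partial\eta_b$), so (26) becomes

$$ U_T \eta_{x_1} \eta_{x_2} \ldots |0\rangle = \sum_{ab} \eta_a \bar\eta_b \eta_{x_1} \eta_{x_2} \ldots |0\rangle \langle a|U|b\rangle. \tag{28} $$

The kets $\eta_{x_1} \eta_{x_2} \ldots |0\rangle$ form a complete set, and
hence we can infer from (28) the operator equation

$$ U_T = \sum_{ab} \eta_a \langle a|U|b\rangle \bar\eta_b. \tag{29} $$

This gives us $U_T$ in terms of the $\eta$ and $\bar\eta$ variables and the
matrix elements $\langle a|U|b\rangle$.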
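
Formula (29) can be tested in the smallest non-trivial case. The sketch
below, which is an illustration and not part of the original
derivation, takes two boson states and two bosons, and compares the
matrix of $\sum_{ab} \eta_a \langle a|U|b\rangle \bar\eta_b$ in the kets
$|2,0\rangle$, $|1,1\rangle$, $|0,2\rangle$ with the matrix of $U_1 + U_2$ in
the corresponding normalized symmetrical kets of two labelled
particles. The truncated matrices for $\eta$ and $\bar\eta$ and all
function names are assumptions of the sketch.

```python
import math

import numpy as np


def ladder_operators(n_max):
    """Return (eta, eta_bar) on the states with 0..n_max quanta."""
    dim = n_max + 1
    eta_bar = np.zeros((dim, dim))
    for n in range(1, dim):
        eta_bar[n - 1, n] = math.sqrt(n)
    return eta_bar.T.copy(), eta_bar


def one_body_oscillator_form(U, n_max=2):
    """Right-hand side of (29) for two boson states, in |2,0>, |1,1>, |0,2>."""
    eta, eta_bar = ladder_operators(n_max)
    identity = np.eye(n_max + 1)
    etas = [np.kron(eta, identity), np.kron(identity, eta)]
    eta_bars = [np.kron(eta_bar, identity), np.kron(identity, eta_bar)]
    U_T = sum(U[a, b] * etas[a] @ eta_bars[b]
              for a in range(2) for b in range(2))
    dim = n_max + 1
    # The ket |n1, n2> has index n1 * dim + n2 in the Kronecker ordering.
    idx = [2 * dim + 0, 1 * dim + 1, 0 * dim + 2]
    return U_T[np.ix_(idx, idx)]


def one_body_particle_form(U):
    """U_1 + U_2 of (22) in the normalized symmetrical two-particle kets."""
    identity = np.eye(2)
    U_T = np.kron(U, identity) + np.kron(identity, U)
    s = 1.0 / math.sqrt(2.0)
    # Columns: |2,0> = |00>, |1,1> = (|01> + |10>) / sqrt(2), |0,2> = |11>.
    basis = np.array([[1, 0, 0], [0, s, 0], [0, s, 0], [0, 0, 1]])
    return basis.T @ U_T @ basis


if __name__ == "__main__":
    U = np.array([[0.3, 1.1], [-0.4, 0.7]])   # arbitrary matrix <a|U|b>
    print(np.allclose(one_body_oscillator_form(U), one_body_particle_form(U)))
```

The agreement includes the factors $\sqrt{2}$ in the elements connecting
$|1,1\rangle$ with $|2,0\rangle$ and $|0,2\rangle$, which come from the
normalization (6) on the one side and from the oscillator variables on
the other.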

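Now let us take a symmetrical function of the boson variables
consisting of a sum of terms each referring to two bosons,

$$ V_T = \sum_{r,\, s \neq r} V_{rs}. \tag{30} $$

We do not need to assume $V_{rs} = V_{sr}$. Corresponding to (23), $V_{rs}$
has matrix elements

$$ \langle\alpha_r^a \alpha_s^b|V_{rs}|\alpha_r^c \alpha_s^d\rangle = \langle ab|V|cd\rangle \tag{31} $$

for brevity. Proceeding as before we get, corresponding to (25),

$$ S V_T |\alpha_1^{x_1} \alpha_2^{x_2} \ldots\rangle = \sum_{r,\, s \neq r} \sum_{ab} S|\alpha^{x_1} \alpha^{x_2} \ldots \alpha^a \ldots \alpha^b \ldots\rangle \langle ab|V|x_r x_s\rangle \tag{32} $$

and corresponding to (26)

$$ V_T \eta_{x_1} \eta_{x_2} \ldots |0\rangle = \sum_{abcd} \eta_a \eta_b \sum_{r,\, s \neq r} \eta_{x_r}^{-1} \eta_{x_s}^{-1} \eta_{x_1} \eta_{x_2} \ldots |0\rangle\, \delta_{c x_r} \delta_{d x_s} \langle ab|V|cd\rangle. \tag{33} $$

We can deduce as an extension of (27)

$$ \bar\eta_c \bar\eta_d \eta_{x_1} \eta_{x_2} \ldots |0\rangle = \sum_{r,\, s \neq r} \eta_{x_r}^{-1} \eta_{x_s}^{-1} \eta_{x_1} \eta_{x_2} \ldots |0\rangle\, \delta_{c x_r} \delta_{d x_s}, \tag{34} $$

so that (33) becomes

$$ V_T \eta_{x_1} \eta_{x_2} \ldots |0\rangle = \sum_{abcd} \eta_a \eta_b \bar\eta_c \bar\eta_d \eta_{x_1} \eta_{x_2} \ldots |0\rangle \langle ab|V|cd\rangle, $$

giving us the operator equation

$$ V_T = \sum_{abcd} \eta_a \eta_b \langle ab|V|cd\rangle \bar\eta_c \bar\eta_d. \tag{35} $$

The method can readily be extended to give any symmetrical function
of the boson variables in terms of the $\eta$'s and $\bar\eta$'s.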

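The foregoing theory can easily be generalized to apply to an
assembly of bosons in interaction with some other dynamical system,
which we shall call for definiteness the atom. We must introduce a
set of basic kets, $|\zeta'\rangle$ say, for the atom alone. We can then get
a set of basic kets for the whole system of atom and bosons together by
multiplying each of the kets $|\zeta'\rangle$ into each of the kets (9). We
may write these kets

$$ |\zeta'\rangle, \quad |\zeta' \alpha^a\rangle, \quad S|\zeta' \alpha^a \alpha^b\rangle, \quad S|\zeta' \alpha^a \alpha^b \alpha^c\rangle, \quad \ldots. \tag{36} $$

We may look upon the system as composed of the atom in interaction
with a set of oscillators, so that it can be described in terms of the
atom variables and the oscillator variables $\eta_a$, $\bar\eta_a$. Using the
standard ket $|0\rangle$ for the set of oscillators, we have

$$ S|\zeta' \alpha^a \alpha^b \alpha^c \ldots\rangle = \eta_a \eta_b \eta_c \ldots |0\rangle |\zeta'\rangle, \tag{37} $$

corresponding to (17), as the equation expressing the basic kets
(36) in terms of the oscillator variables.

Any function of the atom variables and the boson variables that is
symmetrical between all the bosons is expressible as a function of the
atom variables and the $\eta$'s and $\bar\eta$'s. Consider first a function
$U_T$ of the form (22) with $U_r$ a function only of the atom variables
and the variables of the $r$th boson, so that it has a representative
$\langle\zeta' \alpha_r^a|U_r|\zeta'' \alpha_r^b\rangle$. This representative must
be independent of $r$, in order that $U_T$ may be symmetrical between
all the bosons. We may therefore write it

$$ \langle\zeta' \alpha_r^a|U_r|\zeta'' \alpha_r^b\rangle = \langle\zeta' \alpha^a|U|\zeta'' \alpha^b\rangle = \langle\zeta'|\langle a|U|b\rangle|\zeta''\rangle, \tag{38} $$

corresponding to (23), where $\langle a|U|b\rangle$ is that function of the
atom variables whose representative is
$\langle\zeta' \alpha^a|U|\zeta'' \alpha^b\rangle$. The equations (24)-(28) can
now be taken over and applied to the present work if both sides of all
these equations are multiplied by $|\zeta'\rangle$ on the right. Formula (29)
still holds. We can deal similarly with a symmetrical function $V_T$ of
the form (30) with $V_{rs}$ a function only of the atom variables and the
variables of the $r$th and $s$th bosons. Defining
$\langle ab|V|cd\rangle$ to be that function of the atom variables whose
representative is
$\langle\zeta' \alpha_r^a \alpha_s^b|V_{rs}|\zeta'' \alpha_r^c \alpha_s^d\rangle$,
we find that formula (35) still holds.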

61.  Emission  and  absorption  of  bosons 

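Let us suppose the oscillators of the preceding section are harmonic
oscillators with no interaction between them. The energy of the $a$th
oscillator is

$$ H_a = \hbar\omega_a \eta_a \bar\eta_a + \tfrac{1}{2}\hbar\omega_a. $$

We shall neglect the constant term $\tfrac{1}{2}\hbar\omega_a$, which is
the energy of the oscillator in its lowest state, the so-called
zero-point energy. This neglect does not have any dynamical
consequences, as explained at the beginning of § 30. The total energy
of all the oscillators is now

$$ H_T = \sum_a H_a = \sum_a \hbar\omega_a \eta_a \bar\eta_a = \sum_a \hbar\omega_a n_a \tag{39} $$

with the help of (12). This is of the same form as (10), with
$\hbar\omega_a$ for $H^a$. Thus a set of harmonic oscillators is equivalent
to an assembly of bosons in stationary states with no interaction
between them. If an oscillator of the set is in its $n$th quantum state,
there are $n$ bosons in the associated boson state.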

In general the Hamiltonian for the set of oscillators will be a power
series in the variables $\eta_a$, $\bar\eta_a$, say

$$ H_T = H_P + \sum_a (U_a \eta_a + \bar U_a \bar\eta_a) + \sum_{ab} (U_{ab} \eta_a \bar\eta_b + V_{ab} \eta_a \eta_b + \bar V_{ab} \bar\eta_a \bar\eta_b) + \ldots, \tag{40} $$

where $H_P$, $U_a$, $U_{ab}$, $V_{ab}$ are numbers, $H_P$ being real and
$U_{ab} = \bar U_{ba}$. If the set of oscillators is in interaction with an
atom, as we had at the end of the preceding section, the total
Hamiltonian will still be of the form (40), with $H_P$, $U_a$, $U_{ab}$,
$V_{ab}$ functions of the atom variables, $H_P$ in particular being the
Hamiltonian for the atom by itself. A general treatment of this
dynamical system would be rather complicated and for practical
applications one assumes that the terms

$$ H_P + \sum_a U_{aa} \eta_a \bar\eta_a \tag{41} $$

are large compared with the others and form by themselves an
unperturbed system, the remaining terms being taken into account
as a perturbation producing transitions in the unperturbed system,
according to the theory of § 44. If, further, $U_{aa}$ is independent of
the atom variables, the unperturbed system with Hamiltonian (41)
consists merely of an atom with Hamiltonian $H_P$ and an assembly of
bosons in stationary states with Hamiltonian of the form (39), with
no interaction.

Let us consider what kinds of transitions are produced by the
various perturbation terms in (40). Take a stationary state of the
unperturbed system for which the atom is in a stationary state,
$\zeta'$ say, and bosons are present in the stationary boson states
$a, b, c, \ldots$. This stationary state for the unperturbed system
corresponds to the ket

$$ \eta_a \eta_b \eta_c \ldots |0\rangle |\zeta'\rangle, \tag{42} $$

like (37). If the term $U_x \eta_x$ of (40) is multiplied into this ket, the
result is a linear combination of kets like

$$ \eta_x \eta_a \eta_b \eta_c \ldots |0\rangle |\zeta''\rangle, \tag{43} $$

$\zeta''$ denoting any stationary state of the atom. The ket (43) refers to
one more boson than the ket (42), the extra boson being in the state $x$.
Thus the perturbation term $U_x \eta_x$ gives rise to transitions in which
one boson is emitted into state $x$ and the atom makes an arbitrary
jump. If the term $\bar U_x \bar\eta_x$ of (40) is multiplied into (42), the
result is zero unless (42) contains a factor $\eta_x$ and is then a linear
combination of kets like

$$ \eta_x^{-1} \eta_a \eta_b \eta_c \ldots |0\rangle |\zeta''\rangle, $$

referring to one boson less in state $x$. Thus the perturbation term
$\bar U_x \bar\eta_x$ gives rise to transitions in which one boson is absorbed
from state $x$, the atom again making an arbitrary jump. Similarly, we
find that a perturbation term $U_{xy} \eta_x \bar\eta_y$ ($x \neq y$) gives rise
to processes in which a boson is absorbed from state $y$ and one is
emitted into state $x$, or, what is the same thing physically, one boson
makes a transition from state $y$ to state $x$. This kind of process would
be produced by a term like the $U_T$ of (22) and (29) in the perturbation
energy, provided the diagonal elements $\langle a|U|a\rangle$ vanish. Again,
the perturbation terms $V_{xy} \eta_x \eta_y$, $\bar V_{xy} \bar\eta_x \bar\eta_y$
give rise to processes in which two bosons are emitted or absorbed,
and so on for more complicated terms. With any of these emission and
absorption processes the atom can make an arbitrary jump.

Let us determine how the probability of occurrence of each of these
transition processes depends on the numbers of bosons originally
present in the various boson states. From §§ 44, 46 the transition
probability is always proportional to the square of the modulus of
the matrix element of the perturbation energy referring to the two
states concerned. Thus the probability of a boson being emitted into
state $x$ with the atom making a jump from state $\zeta'$ to state
$\zeta''$ is proportional to

$$ |\langle\zeta''|\langle n'_1 n'_2 \ldots (n'_x + 1) \ldots|\, U_x \eta_x \,|n'_1 n'_2 \ldots n'_x \ldots\rangle|\zeta'\rangle|^2, \tag{44} $$

the $n'$'s being the numbers of bosons initially present in the various
boson states. Now from (6) and (17), with reference to (4),

$$ |n'_1 n'_2 n'_3 \ldots\rangle = (n'_1!\, n'_2!\, n'_3! \ldots)^{-1/2}\, \eta_1^{n'_1} \eta_2^{n'_2} \eta_3^{n'_3} \ldots |0\rangle, \tag{45} $$

so that

$$ \eta_x |n'_1 n'_2 \ldots n'_x \ldots\rangle = (n'_x + 1)^{1/2}\, |n'_1 n'_2 \ldots (n'_x + 1) \ldots\rangle. \tag{46} $$

Hence (44) is equal to

$$ (n'_x + 1)\, |\langle\zeta''|U_x|\zeta'\rangle|^2, \tag{47} $$

showing that the probability of a transition in which a boson is emitted
into state $x$ is proportional to the number of bosons originally in
state $x$ plus one.

The probability of a boson being absorbed from state $x$ with the
atom making a jump from state $\zeta'$ to state $\zeta''$ is proportional to

$$ |\langle\zeta''|\langle n'_1 n'_2 \ldots (n'_x - 1) \ldots|\, \bar U_x \bar\eta_x \,|n'_1 n'_2 \ldots n'_x \ldots\rangle|\zeta'\rangle|^2, \tag{48} $$

the $n'$'s again being the numbers of bosons initially present in the
various boson states. Now from (45)

$$ \bar\eta_x |n'_1 n'_2 \ldots n'_x \ldots\rangle = n_x'^{1/2}\, |n'_1 n'_2 \ldots (n'_x - 1) \ldots\rangle, \tag{49} $$

so (48) is equal to

$$ n'_x\, |\langle\zeta''|\bar U_x|\zeta'\rangle|^2. \tag{50} $$

Thus the probability of a transition in which a boson is absorbed from
state $x$ is proportional to the number of bosons originally in state $x$.

Similar methods may be applied to more complicated processes,
and show that the probability of a process in which a boson makes
a transition from state $y$ to state $x$ ($x \neq y$) is proportional to
$n'_y (n'_x + 1)$. More generally, the probability of a process in which
bosons are absorbed from states $x, y, \ldots$ and emitted into states
$a, b, \ldots$ is proportional to

$$ n'_x n'_y \ldots (n'_a + 1)(n'_b + 1) \ldots, \tag{51} $$

the $n'$'s being in each case the numbers of bosons originally present.
These results hold both for direct transition processes and transition
processes that take place through one or more intermediate states,
in accordance with the interpretation given at the end of § 44.
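
The occupation-number factors in (47), (50) and (51) can be read off
from explicit matrices. The following Python sketch is an illustration
added to the text; it uses matrices for $\eta$ and $\bar\eta$ on a Fock
space cut off at `n_max` quanta (an approximation made only for the
sketch, harmless as long as the occupation numbers stay below the
cut-off) and leaves out the atomic factor
$|\langle\zeta''|U_x|\zeta'\rangle|^2$. All function names are hypothetical.

```python
import math

import numpy as np


def ladder_operators(n_max):
    """Return (eta, eta_bar) on the states with 0..n_max quanta."""
    dim = n_max + 1
    eta_bar = np.zeros((dim, dim))
    for n in range(1, dim):
        eta_bar[n - 1, n] = math.sqrt(n)
    return eta_bar.T.copy(), eta_bar


def number_ket(n, n_max):
    """Normalized ket with n bosons in the state considered."""
    ket = np.zeros(n_max + 1)
    ket[n] = 1.0
    return ket


def emission_factor(n, n_max=10):
    """|<n+1| eta |n>|^2, the boson part of (44)."""
    eta, _ = ladder_operators(n_max)
    return float((number_ket(n + 1, n_max) @ eta @ number_ket(n, n_max)) ** 2)


def absorption_factor(n, n_max=10):
    """|<n-1| eta_bar |n>|^2, the boson part of (48); zero if no boson is present."""
    if n == 0:
        return 0.0
    _, eta_bar = ladder_operators(n_max)
    return float((number_ket(n - 1, n_max) @ eta_bar @ number_ket(n, n_max)) ** 2)


def transfer_factor(n_x, n_y, n_max=10):
    """|<n_x+1, n_y-1| eta_x eta_bar_y |n_x, n_y>|^2 for a boson going from y to x."""
    if n_y == 0:
        return 0.0
    eta, eta_bar = ladder_operators(n_max)
    identity = np.eye(n_max + 1)
    operator = np.kron(eta, identity) @ np.kron(identity, eta_bar)
    initial = np.kron(number_ket(n_x, n_max), number_ket(n_y, n_max))
    final = np.kron(number_ket(n_x + 1, n_max), number_ket(n_y - 1, n_max))
    return float((final @ operator @ initial) ** 2)


if __name__ == "__main__":
    for n in range(4):
        print(n, emission_factor(n), absorption_factor(n))
    print(transfer_factor(3, 2))
```

For each $n$ the emission factor is $n + 1$ and the absorption factor is
$n$, and the transfer factor for $n'_x = 3$, $n'_y = 2$ is
$n'_y (n'_x + 1) = 8$.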


62.  Application  to  photons 

Since photons are bosons, the foregoing theory can be applied to
them. A photon is in a stationary state when it is in an eigenstate
of momentum. It then has two independent states of polarization,
which may be taken to be two perpendicular states of linear
polarization. The dynamical variables needed to describe the stationary
states are then the momentum $\mathbf{p}$, a vector, and a polarization
variable $\mathbf{l}$, consisting of a unit vector perpendicular to
$\mathbf{p}$. The variables $\mathbf{p}$ and $\mathbf{l}$ take the place of our
previous $\alpha$'s. The eigenvalues of $\mathbf{p}$ consist of all numbers
from $-\infty$ to $\infty$ for each of the three Cartesian components of
$\mathbf{p}$, while for each eigenvalue $\mathbf{p}'$ of $\mathbf{p}$,
$\mathbf{l}$ has just two eigenvalues, namely two arbitrarily chosen
vectors perpendicular to $\mathbf{p}'$ and to one another. Owing to the
eigenvalues of $\mathbf{p}$ forming a continuous range, there is a
continuous range of stationary states, giving us the continuous basic
kets $|\mathbf{p}'\mathbf{l}'\rangle$. However, the foregoing theory was built
up in terms of discrete basic kets $|\alpha'\rangle$ for a boson. There are
two formalisms which one may use for getting over this discrepancy.

The first consists in replacing the continuous three-dimensional
distribution of eigenvalues for $\mathbf{p}$ by a large number of discrete
points lying very close together, forming a dust spread over the whole
three-dimensional $\mathbf{p}$-space. Let $s_{\mathbf{p}'}$ be the density of
the dust (the number of points per unit volume) in the neighbourhood of
any point $\mathbf{p}'$. Then $s_{\mathbf{p}'}$ must be large and positive,
but is otherwise an arbitrary function of $\mathbf{p}'$. An integral over
the $\mathbf{p}$-space may be replaced by a sum over the dust of points, in
accordance with the formula

$$ \iiint f(\mathbf{p}')\, dp'_x\, dp'_y\, dp'_z = \sum_{\mathbf{p}'} f(\mathbf{p}')\, s_{\mathbf{p}'}^{-1}, \tag{52} $$

which formula provides the basis of the passage from continuous
$\mathbf{p}'$ values to discrete ones and vice versa. Any problem can be
worked out in terms of the discrete $\mathbf{p}'$ values, for which the
theory of §§ 59-61 can be used, and the results can be transformed back
to refer to continuous $\mathbf{p}'$ values. The arbitrary density
$s_{\mathbf{p}'}$ should then disappear from the results.
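
The way in which the arbitrary density drops out of (52) can be seen in
a one-dimensional analogue. The sketch below is an illustration added
to the text, not the three-dimensional formula itself. The dust is
generated by mapping an evenly spaced grid in an auxiliary parameter $t$
through a monotonic function `phi`, so that the density of points in $p$
is $1/(\Delta t\, \phi'(t))$. The functions `phi` and `dphi` and the
Gaussian test function are assumptions of the sketch.

```python
import math

import numpy as np


def dust_sum(f, phi, dphi, t_max=4.0, n_points=4001):
    """One-dimensional analogue of the right-hand side of (52).

    The points p = phi(t) lie on an evenly spaced t grid, so the number of
    dust points per unit p is s = 1 / (dt * dphi(t)); the sum of f / s over
    the dust then approximates the integral of f over p.
    """
    t = np.linspace(-t_max, t_max, n_points)
    dt = t[1] - t[0]
    p = phi(t)
    s = 1.0 / (dt * dphi(t))
    return float(np.sum(f(p) / s))


def gaussian(p):
    """Test function whose integral over all p is sqrt(pi)."""
    return np.exp(-p ** 2)


if __name__ == "__main__":
    # Two quite different dust densities.
    first = dust_sum(gaussian, np.sinh, np.cosh)
    second = dust_sum(gaussian, lambda t: t + t ** 3, lambda t: 1 + 3 * t ** 2)
    print(first, second, math.sqrt(math.pi))
```

Both choices of density reproduce the integral to high accuracy, which
is the one-dimensional form of the statement that $s_{\mathbf{p}'}$ should
disappear from the results.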

The second formalism consists in modifying the equations of the
theory of §§ 59-61 so as to make them apply to the case of a
continuous range of basic kets $|\alpha'\rangle$, by replacing sums by
integrals and replacing the $\delta$ symbol in the commutation relations
(11) by $\delta$ functions, so far as concerns the variables with
continuous eigenvalues. Each of these formalisms has some advantages
and some disadvantages. The first is usually more convenient for
physical discussion, the second for mathematical development. Both
will be developed here and one or other will be used according to which
is more suitable at the moment.

The Hamiltonian describing an assembly of photons interacting
with an atom will be of the general form (40), with the coefficients
$U_a$, $U_{ab}$, $V_{ab}$ involving the atom variables. This Hamiltonian
may be written

$$ H_T = H_P + H_Q + H_R, \tag{53} $$

where $H_P$ is the energy of the atom alone, $H_R$ is the energy of the
assembly of photons alone,

$$ H_R = \sum_{\mathbf{p}'\mathbf{l}'} n_{\mathbf{p}'\mathbf{l}'}\, h\nu_{\mathbf{p}'}, \tag{54} $$

$\nu_{\mathbf{p}'}$ being the frequency of a photon of momentum
$\mathbf{p}'$, and $H_Q$ is the interaction energy, which can be evaluated
from analogy with the classical theory, as will be shown in the next
section. The whole system can be treated by a perturbation method as
discussed in the
preceding section, $H_P$ and H
unperturbed  system  and  ffQ  l 
gives  rise  to  transition  process 
absorbed  and  the  atom  jumps 
We  saw  in  the  preceding  sect 
tion  process  is  proportional  to  1 
state  from  which  a boson  is  al 
the  probability  of  a photon  bei 
incident  on  an  atom  is  propoi 
We  also  saw  that  the  probabil 
tional  to  the  number  of  bosons 
one.  To  interpret  this  result  v 
relations  involved  in  replacing  t 
by  a discrete  set. 

Let  us  neglect  for  the  presc 
|p'D)  be  the  normalized  ket  c 
state  p'.  Then  from  (22)  of  § li 

2 IP'd 

P' 

which  gives  from  (52) 

J Ip'd><p 

d3p'  being  written  for  dp'J.  dp'ydp 
ket  corresponding  to  the  contin 
(24)  of  §16 

J Ip'X 

which  show's,  on  comparison  wii 

iP'>  = 

The  connexion  between  jp'y  and 
the  basic  kets  when  one  changes  1 
tion,  as  shown  by  (38)  of  § 16. 

With  rip.  photons  in  each  d 
density  p for  the  assembly  of  pi: 

p = 2 !P'l»>p-\P't)l 

p' 

= / P ;'p  p'  d'3/,' 
with the help of (56). The number of photons per unit volume in the
neighbourhood of any point $x'$ is then $\langle x'|\rho|x'\rangle$,
according to (73) of § 33. From (57) this equals

$$\langle x'|\rho|x'\rangle = \int \langle x'|p'\rangle\, n'_{p'}\, \langle p'|x'\rangle\, d^3p' = \int h^{-3} n'_{p'}\, d^3p' \tag{58}$$

if one puts in the value of the transformation function
$\langle x'|p'\rangle$ given by (54) of § 23. Equation (58) expresses the
number of photons per unit volume as an integral over the momentum
space, so the integrand in (58) can be interpreted as the number of
photons per unit of phase space. We obtain in this way the result that
the number of photons per unit of phase space is equal to $h^{-3}$ times
the number of photons per discrete state, in other words, a cell of
volume $h^3$ in phase space is equivalent to a discrete state. This
result is a general one, holding for any kind of particle. If the
polarization variable of the photons is not neglected, the result holds
for each of the two independent states of polarization.
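
The equivalence between a discrete state and a cell of volume $h^3$ can
be checked by direct counting. The sketch below is only an illustration
and assumes a model not used in the text: a cubical box of side $L$ with
periodic boundary conditions, so that the allowed momenta are
$\mathbf{p} = h\mathbf{n}/L$ with integer $\mathbf{n}$. The function
names are hypothetical. The number of states with $|\mathbf{n}| \le R$
is compared with the phase-space volume $L^3 \cdot \tfrac{4}{3}\pi (hR/L)^3$
divided by $h^3$, which is $\tfrac{4}{3}\pi R^3$.

```python
import math


def count_states(radius: int) -> int:
    """Count integer triples (nx, ny, nz) with nx^2 + ny^2 + nz^2 <= radius^2."""
    r2 = radius * radius
    total = 0
    for nx in range(-radius, radius + 1):
        for ny in range(-radius, radius + 1):
            rem = r2 - nx * nx - ny * ny
            if rem < 0:
                continue
            # nz may run from -isqrt(rem) to +isqrt(rem) inclusive
            total += 2 * math.isqrt(rem) + 1
    return total


def phase_space_estimate(radius: int) -> float:
    """Phase-space volume of the same sphere in units of h^3: (4/3) pi R^3."""
    return 4.0 * math.pi * radius ** 3 / 3.0
```

The two numbers agree better and better as `radius` grows, which is the
content of the statement that the discrete set may be replaced by a
continuous range.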

The momentum of a photon of frequency $\nu$ is of magnitude $h\nu/c$,
so the element of momentum space is

$$dp_x\, dp_y\, dp_z = h^3 c^{-3} \nu^2\, d\nu\, d\omega,$$

$d\omega$ being an element of solid angle for the direction of the
vector $\mathbf{p}$. Thus a distribution of photons with $n'_p$ per
discrete state, which is equivalent to a distribution of
$h^{-3} n'_p\, d^3p\, d^3x$ photons in an element of volume $d^3x$ and an
element of momentum space $d^3p$, equals a distribution of
$n'_p c^{-3} \nu^2\, d\nu\, d\omega\, d^3x$ photons in an element of
volume $d^3x$ and a frequency range $d\nu$ and direction of motion
$d\omega$. This corresponds to an energy density $n'_p h\nu^3/c^3$ per
unit solid angle per unit frequency range, or an intensity per unit
frequency range (i.e. an energy crossing unit area per unit time per
unit frequency range) of amount

$$I_\nu = n'_p h\nu^3/c^2. \tag{59}$$


The result that the probability of a photon being emitted is
proportional to $n'_{pl} + 1$, $n'_{pl}$ being the number of photons
initially present in the discrete state concerned, can now be
interpreted as the probability being proportional to
$I_{\nu l} + h\nu^3/c^2$, where $I_{\nu l}$ is the intensity of the
incident radiation per unit frequency range in the neighbourhood of the
frequency of the emitted photon and having the same polarization
$\mathbf{l}$ as the emitted photon. Thus with no incident radiation
there is still a certain amount of emission, but the emission is
increased or stimulated by incident radiation in the same direction and
having the same frequency and polarization as the emitted radiation.
The present theory of radiation thus completes the imperfect one of
§ 45 by giving both stimulated and spontaneous emission. The ratio it
gives for the two kinds of emission, namely
$I_{\nu l} : h\nu^3/c^2$, is in agreement with that provided by
Einstein's theory of statistical equilibrium mentioned in § 45.

The probability of a photon being scattered from the state
$\mathbf{p}'\mathbf{l}'$ to the state $\mathbf{p}''\mathbf{l}''$ is
proportional to $n'_{p'l'}(n'_{p''l''} + 1)$, the $n'$'s being the
numbers of photons initially in the discrete states concerned. We can
interpret this result as the probability being proportional to

$$I_{\nu' l'}\,(I_{\nu'' l''} + h\nu''^3/c^2). \tag{60}$$

Similarly for a more general radiative process in which several
photons are emitted and absorbed, the probability is proportional to a
factor $I_{\nu l}$ for each absorbed photon and a factor
$I_{\nu l} + h\nu^3/c^2$ for each emitted photon. Thus the process is
stimulated by incident radiation in the same direction and with the
same frequency and polarization as any of the emitted photons.
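
As a small numerical illustration of (59), the sketch below inverts the
relation between intensity and the number of photons per discrete state
and forms the factor $n' + 1$ by which emission is enhanced. It is not
part of the original argument. The function names and the choice of SI
units are assumptions made for the example, and the intensity is
understood as per unit frequency range, per unit solid angle and for
one polarization.

```python
H = 6.62607015e-34  # Planck constant in J s
C = 2.99792458e8    # speed of light in m/s


def photons_per_state(intensity: float, nu: float) -> float:
    """Invert (59): n' = I_nu c^2 / (h nu^3)."""
    return intensity * C ** 2 / (H * nu ** 3)


def emission_enhancement(intensity: float, nu: float) -> float:
    """Ratio of total emission (stimulated plus spontaneous) to spontaneous emission."""
    return photons_per_state(intensity, nu) + 1.0
```

With no incident radiation the enhancement is 1 (spontaneous emission
only). It reaches 2 when the incident intensity equals $h\nu^3/c^2$,
i.e. when there is on average one photon per discrete state.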

63.  The  interaction  energy  between  photons  and  an  atom 

We shall now determine the interaction energy between an atom and an
assembly of photons, i.e. the $H_Q$ of equation (53), from analogy with
the classical expression for the interaction energy between an atom
and a field of radiation. For simplicity we shall suppose the atom to
consist of a single electron moving in an electrostatic field of force.
The field of radiation may be described by a scalar and a vector
potential. These potentials are to a certain extent arbitrary and may
be chosen so that the scalar potential vanishes. The field is then
completely described by the vector potential $A_x$, $A_y$, $A_z$, or
$\mathbf{A}$. The change that the field causes in the Hamiltonian
describing the atom is now, as explained at the beginning of § 41,

$$H_Q = \frac{1}{2m}\left\{\left(\mathbf{p} + \frac{e}{c}\mathbf{A}\right)^2 - \mathbf{p}^2\right\} = \frac{e}{mc}(\mathbf{p}, \mathbf{A}) + \frac{e^2}{2mc^2}\mathbf{A}^2. \tag{61}$$

This is the classical interaction energy. The $\mathbf{A}$ that occurs
here should be the value of the vector potential at the point where the
electron is momentarily situated. It is, however, a good enough
approximation if we take this $\mathbf{A}$ to be the vector potential at
some fixed point in the atom, such as the nucleus, provided we are
dealing with radiation whose wavelength is large compared with the
dimensions of the atom.

Let us first consider the field of radiation classically and ignore its
interaction with the atom. The vector potential $\mathbf{A}$ satisfies,
according to Maxwell's theory, the equations

$$\Box\mathbf{A} = 0, \qquad \operatorname{div}\mathbf{A} = 0, \tag{62}$$

$\Box$ being short for
$\partial^2/c^2\partial t^2 - \partial^2/\partial x^2 - \partial^2/\partial y^2 - \partial^2/\partial z^2$.
The first of these equations shows that $\mathbf{A}$ can be resolved
into Fourier components in the form

$$\mathbf{A} = \int \left\{\mathbf{A}_k e^{-i(\mathbf{kx}) + 2\pi i\nu_k t} + \bar{\mathbf{A}}_k e^{i(\mathbf{kx}) - 2\pi i\nu_k t}\right\} d^3k, \tag{63}$$

each Fourier component representing a train of waves moving with the
velocity of light, described by a vector $\mathbf{k}$ whose direction
gives the direction of motion of the waves and whose magnitude
$|\mathbf{k}|$ is connected with their frequency $\nu_k$ by

$$2\pi\nu_k = c|\mathbf{k}|. \tag{64}$$

The vector $\mathbf{k}$ is just the momentum of a photon which the
quantum theory would associate with these waves, divided by $\hbar$.
For each value of $\mathbf{k}$ we have an amplitude $\mathbf{A}_k$,
which is in general a complex vector, and the integral in (63) extends
over the whole of the three-dimensional $\mathbf{k}$-space. The second
of equations (62) gives

$$(\mathbf{k}, \mathbf{A}_k) = 0, \tag{65}$$

showing that, for each value of $\mathbf{k}$, $\mathbf{A}_k$ is
perpendicular to $\mathbf{k}$. This expresses that the waves are
transverse waves. $\mathbf{A}_k$ is determined by its two components in
two directions perpendicular to each other and to $\mathbf{k}$, these
two components corresponding to two independent states of linear
polarization.

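The total energy of the radiation is given by the volume integral

$$H_R = (8\pi)^{-1} \int (\mathcal{E}^2 + \mathcal{H}^2)\, d^3x \tag{66}$$

taken over the whole of space, where the electric field $\mathcal{E}$
and the magnetic field $\mathcal{H}$ of the radiation are given by

$$\mathcal{E} = -\frac{1}{c}\frac{\partial\mathbf{A}}{\partial t}, \qquad \mathcal{H} = \operatorname{curl}\mathbf{A}. \tag{67}$$

Using standard formulas of vector analysis, we have

$$\operatorname{div}[\mathbf{A}\times\mathcal{H}] = (\mathcal{H}, \operatorname{curl}\mathbf{A}) - (\mathbf{A}, \operatorname{curl}\mathcal{H}) = \mathcal{H}^2 - (\mathbf{A}, \operatorname{curl}\operatorname{curl}\mathbf{A}) = \mathcal{H}^2 + (\mathbf{A}, \nabla^2\mathbf{A})$$

with the help of the second of equations (62). Thus (66) becomes, with
neglect of a term which can be transformed to a surface integral at
infinity,

$$H_R = (8\pi)^{-1} \int \left\{\frac{1}{c^2}\left(\frac{\partial\mathbf{A}}{\partial t}, \frac{\partial\mathbf{A}}{\partial t}\right) - (\mathbf{A}, \nabla^2\mathbf{A})\right\} d^3x. \tag{68}$$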

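By substituting for $\mathbf{A}$ here its value given by (63), we can
get the energy of the radiation in terms of the Fourier amplitudes
$\mathbf{A}_k$. The energy of the radiation is constant (since we are
now ignoring the interaction of the radiation and the atom), so in this
calculation we may take $t = 0$. This means taking

$$\mathbf{A} = \int (\mathbf{A}_k + \bar{\mathbf{A}}_{-k})\, e^{-i(\mathbf{kx})}\, d^3k,$$

$$\nabla^2\mathbf{A} = -\int k^2 (\mathbf{A}_k + \bar{\mathbf{A}}_{-k})\, e^{-i(\mathbf{kx})}\, d^3k, \tag{69}$$

$$\partial\mathbf{A}/\partial t = ic \int |\mathbf{k}|\, (\mathbf{A}_k - \bar{\mathbf{A}}_{-k})\, e^{-i(\mathbf{kx})}\, d^3k. \tag{70}$$

Inserting these expressions in (68), we get

$$H_R = (8\pi)^{-1} \iiint \left\{k'^2 (\mathbf{A}_k + \bar{\mathbf{A}}_{-k}, \mathbf{A}_{k'} + \bar{\mathbf{A}}_{-k'}) - |\mathbf{k}||\mathbf{k}'| (\mathbf{A}_k - \bar{\mathbf{A}}_{-k}, \mathbf{A}_{k'} - \bar{\mathbf{A}}_{-k'})\right\} e^{-i(\mathbf{kx})} e^{-i(\mathbf{k}'\mathbf{x})}\, d^3k\, d^3k'\, d^3x$$

$$= \pi^2 \iint \left\{k'^2 (\mathbf{A}_k + \bar{\mathbf{A}}_{-k}, \mathbf{A}_{k'} + \bar{\mathbf{A}}_{-k'}) - |\mathbf{k}||\mathbf{k}'| (\mathbf{A}_k - \bar{\mathbf{A}}_{-k}, \mathbf{A}_{k'} - \bar{\mathbf{A}}_{-k'})\right\} \delta(\mathbf{k} + \mathbf{k}')\, d^3k\, d^3k',$$

with the help of formula (49) of § 23, $\delta(\mathbf{k} + \mathbf{k}')$
being the product of three factors, one for each component of
$\mathbf{k}$. Hence

$$H_R = \pi^2 \int k^2 \left\{(\mathbf{A}_k + \bar{\mathbf{A}}_{-k}, \mathbf{A}_{-k} + \bar{\mathbf{A}}_k) - (\mathbf{A}_k - \bar{\mathbf{A}}_{-k}, \mathbf{A}_{-k} - \bar{\mathbf{A}}_k)\right\} d^3k$$

$$= 2\pi^2 \int k^2 \left\{(\mathbf{A}_k, \bar{\mathbf{A}}_k) + (\mathbf{A}_{-k}, \bar{\mathbf{A}}_{-k})\right\} d^3k$$

$$= 4\pi^2 \int k^2 (\mathbf{A}_k, \bar{\mathbf{A}}_k)\, d^3k. \tag{71}$$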

We can replace the continuous distribution of $\mathbf{k}$-values by a
dust of discrete $\mathbf{k}$-values, like we did with the
$\mathbf{p}$-values in the preceding section. The integral (71) then
goes over, according to formula (52), into the sum

$$H_R = 4\pi^2 \sum_k k^2 (\mathbf{A}_k, \bar{\mathbf{A}}_k)\, s_k^{-1},$$

$s_k$ being the density of the discrete $\mathbf{k}$-values. We may
also write this as

$$H_R = 4\pi^2 \sum_{kl} k^2 A_{kl} \bar{A}_{kl}\, s_k^{-1}, \tag{72}$$

$A_{kl}$ being a component of $\mathbf{A}_k$ in a direction $\mathbf{l}$
perpendicular to $\mathbf{k}$ and the summation with respect to
$\mathbf{l}$ referring to two directions $\mathbf{l}$ perpendicular to
each other. Thus there is one term in (72) for each independent
stationary state for a photon.

The field quantities $\mathcal{E}$ and $\mathcal{H}$ at any point
$\mathbf{x}$ can be looked upon as dynamical variables. The quantities

$$\mathbf{A}_{kt} = \mathbf{A}_k e^{2\pi i\nu_k t}, \qquad \bar{\mathbf{A}}_{kt} = \bar{\mathbf{A}}_k e^{-2\pi i\nu_k t}$$

are then dynamical variables at time $t$, since they are connected with
$\mathcal{E}$ and $\mathcal{H}$ at various points $\mathbf{x}$ at time
$t$ by equations which do not involve $t$, as follows from (63) and
(67). $A_{kl}$ is constant, so $A_{klt}$ varies with $t$ according to
the simple harmonic law. Thus $A_{klt}$ is like the $\eta_t$ of a
harmonic oscillator, defined by (3) of § 34, the $\omega$ of the
oscillator being $2\pi\nu_k$. We may take each $A_{klt}$ to be
proportional to the $\eta_t$ of some harmonic oscillator and then the
field of radiation becomes a set of harmonic oscillators.

Let us now pass over to the quantum theory and take the $A_{klt}$,
$\bar{A}_{klt}$ to be dynamical variables in the Heisenberg picture.
The expression (72) for the energy may be retained unchanged, the order
in which the factors $A_{kl}$, $\bar{A}_{kl}$ there occur being the
correct one to give no zero-point energy. The $A_{klt}$ then still vary
with time according to the $e^{i\omega t}$ law and may still be taken
to be proportional to the $\eta_t$'s of harmonic oscillators. The
factor of proportionality may be obtained by equating (72) to the
expression (39) for the energy, with the label $a$ replaced by the two
labels $\mathbf{k}$ and $\mathbf{l}$ and with $h\nu_k$ for
$\hbar\omega_a$. This gives

$$4\pi^2 k^2 A_{klt} \bar{A}_{klt}\, s_k^{-1} = h\nu_k\, \eta_{klt} \bar{\eta}_{klt},$$

the suffix $t$ being inserted to show that we are dealing with
Heisenberg dynamical variables (as we should when transferring
equations of the classical theory to the quantum theory). Hence, using
(64),

$$4\pi^2 A_{klt} = c\, h^{1/2} \nu_k^{-1/2} s_k^{1/2}\, \eta_{klt}, \tag{73}$$

with neglect of an unimportant arbitrary phase factor. In this way the
Heisenberg dynamical variables $\eta_{klt}$, which describe the field
of radiation as a set of oscillators, are introduced. The commutation
relations between the $\eta_{klt}$ and $\bar{\eta}_{klt}$ are known,
being given by (11), so equation (73) fixes the commutation relations
between the $A_{klt}$ and $\bar{A}_{klt}$. It thus fixes the
commutation relations between the potentials $\mathbf{A}$ and the field
quantities $\mathcal{E}$ and $\mathcal{H}$ at various points
$\mathbf{x}$ at the time $t$. (Incidentally, the commutation relations
of the $A_{kl}$, $\bar{A}_{kl}$ are fixed, so the commutation relation
of two potential or field quantities at two different times is also
fixed.)
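
The factor of proportionality in (73) can be verified numerically. The
following sketch treats $\eta$ as an ordinary complex number, which is
enough for checking the constants, and confirms that one term of (72)
then equals $h\nu_k \eta\bar{\eta}$. The function names are
hypothetical and the numerical values used in any call are arbitrary.

```python
import math


def field_amplitude(eta: complex, nu: float, s_k: float, h: float, c: float) -> complex:
    """Solve (73) for A_kl: 4 pi^2 A = c h^(1/2) nu^(-1/2) s_k^(1/2) eta."""
    return c * math.sqrt(h * s_k / nu) * eta / (4.0 * math.pi ** 2)


def mode_energy_from_field(a: complex, nu: float, s_k: float, c: float) -> float:
    """One term of (72): 4 pi^2 k^2 |A|^2 / s_k, with k = 2 pi nu / c from (64)."""
    k = 2.0 * math.pi * nu / c
    return 4.0 * math.pi ** 2 * k ** 2 * abs(a) ** 2 / s_k
```

For any choice of `eta`, `nu` and `s_k` the energy returned by
`mode_energy_from_field(field_amplitude(...), ...)` should equal
`h * nu * abs(eta) ** 2`, independently of the density `s_k`.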

We can still use (73) when the interaction between the field of
radiation and the atom is taken into account. This involves assuming
that the interaction does not affect the commutation relations between
the potentials and field quantities at a given time. The interaction
causes the $\eta_{klt}$'s to cease to vary according to the simple
harmonic law and the oscillators to cease to be harmonic. Thus it may
affect the commutation relation between two potential or field
quantities at two different times.

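We can now take over the interaction energy (61) into the quantum
theory, putting $\mathbf{p}_t$ for $\mathbf{p}$ to show it is a
Heisenberg dynamical variable. Taking the atomic nucleus to be at the
origin we get, by substituting (63) with $\mathbf{x} = 0$ into (61),

$$H_{Qt} = \frac{e}{mc} \int (\mathbf{p}_t, \mathbf{A}_{kt} + \bar{\mathbf{A}}_{kt})\, d^3k + \frac{e^2}{2mc^2} \iint (\mathbf{A}_{kt} + \bar{\mathbf{A}}_{kt}, \mathbf{A}_{k't} + \bar{\mathbf{A}}_{k't})\, d^3k\, d^3k'$$

$$= \frac{e}{mc} \sum_k (\mathbf{p}_t, \mathbf{A}_{kt} + \bar{\mathbf{A}}_{kt})\, s_k^{-1} + \frac{e^2}{2mc^2} \sum_{kk'} (\mathbf{A}_{kt} + \bar{\mathbf{A}}_{kt}, \mathbf{A}_{k't} + \bar{\mathbf{A}}_{k't})\, s_k^{-1} s_{k'}^{-1}$$

if we pass from continuous to discrete $\mathbf{k}$-values. Thus

$$H_{Qt} = \frac{e}{mc} \sum_{kl} p_{lt} (A_{klt} + \bar{A}_{klt})\, s_k^{-1} + \frac{e^2}{2mc^2} \sum_{kk'll'} (A_{klt} + \bar{A}_{klt})(A_{k'l't} + \bar{A}_{k'l't})(\mathbf{l}\mathbf{l}')\, s_k^{-1} s_{k'}^{-1},$$

$p_{lt}$ being the component of $\mathbf{p}_t$ in the direction
$\mathbf{l}$. With the help of (73) we may express $H_{Qt}$ in terms of
the $\eta_{klt}$ and $\bar{\eta}_{klt}$, and we can then drop the
suffix $t$ (which means going over to Schrödinger dynamical
variables), so that we obtain finally

$$H_Q = \frac{e h^{1/2}}{4\pi^2 m} \sum_{kl} \nu_k^{-1/2}\, p_l (\eta_{kl} + \bar{\eta}_{kl})\, s_k^{-1/2} + \frac{e^2 h}{32\pi^4 m} \sum_{kk'll'} \nu_k^{-1/2} \nu_{k'}^{-1/2} (\eta_{kl} + \bar{\eta}_{kl})(\eta_{k'l'} + \bar{\eta}_{k'l'})(\mathbf{l}\mathbf{l}')\, s_k^{-1/2} s_{k'}^{-1/2}. \tag{74}$$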

With the model of the atom we are using, the interaction energy
appears as a linear plus a quadratic function in the $\eta$'s and
$\bar{\eta}$'s. The linear terms give rise to emission and absorption
processes, the quadratic ones to scattering processes and processes in
which two photons are absorbed or emitted simultaneously. The order of
the factors $\eta$ and $\bar{\eta}$ in the quadratic terms is not
determined by the procedure of working from the classical theory, but
this order is unimportant, since a change in it merely changes $H_Q$ by
a constant.

The matrix element of $H_Q$ referring to the emission of a photon into
the discrete state $\mathbf{k}\mathbf{l}$, or into the discrete state
$\mathbf{p}'\mathbf{l}$, as it may also be labelled, with the atom
jumping from state $\alpha^0$ to state $\alpha'$, is

$$\langle p'_D l\, \alpha'|H_Q|\alpha^0\rangle = \frac{e h^{1/2}}{4\pi^2 m \nu^{1/2}} \langle\alpha'|p_l|\alpha^0\rangle\, s_k^{-1/2} = \frac{e}{m h (2\pi\nu)^{1/2}} \langle\alpha'|p_l|\alpha^0\rangle\, s_p^{-1/2},$$

since $s_k = s_p \hbar^3$. The $p_l$ occurring here, referring to the
momentum of the electron, is, of course, quite distinct from the other
letters $\mathbf{p}$, referring to the momentum of the emitted photon.
To avoid confusion we shall replace the electron momentum $\mathbf{p}$
by $m\dot{\mathbf{x}}$, these two dynamical variables being the same for
the unperturbed atom. Passing over to continuous photon states by
means of the conjugate imaginary of equation (56), we get

$$\langle p'l'\alpha'|H_Q|\alpha^0\rangle = \frac{e}{h(2\pi\nu')^{1/2}} \langle\alpha'|\dot{x}_{l'}|\alpha^0\rangle. \tag{75}$$

Similarly, the matrix element of $H_Q$ referring to the absorption of a
photon from the continuous state $\mathbf{p}^0\mathbf{l}^0$ with the
atom jumping from state $\alpha^0$ to state $\alpha'$ is

$$\langle\alpha'|H_Q|p^0l^0\alpha^0\rangle = \frac{e}{h(2\pi\nu^0)^{1/2}} \langle\alpha'|\dot{x}_{l^0}|\alpha^0\rangle, \tag{76}$$

and the matrix element referring to the scattering of a photon from
the continuous state $\mathbf{p}^0\mathbf{l}^0$ to the continuous state
$\mathbf{p}'\mathbf{l}'$ with the atom jumping from state $\alpha^0$ to
state $\alpha'$ is

$$\langle p'l'\alpha'|H_Q|p^0l^0\alpha^0\rangle = \frac{e^2}{2\pi h^2 m\, \nu^{0\,1/2} \nu'^{1/2}} (\mathbf{l}'\mathbf{l}^0)\, \delta_{\alpha'\alpha^0}, \tag{77}$$

there being two terms in (74) which contribute to it. These matrix
elements will be used in the next section. The matrix elements
referring to the simultaneous absorption or emission of two photons may
be written down in the same way, but they lead to physical effects too
small to be of practical importance.
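
The passage from the coefficients of (74) to those of (75) and (77)
involves only the relation $s_k = s_p\hbar^3$ and the weight factor
$s_p^{1/2}$ of (56). The sketch below checks this arithmetic
numerically. It is an illustration only: the function names are
hypothetical, and the coefficients are those multiplying
$\langle\alpha'|p_l|\alpha^0\rangle$ and
$(\mathbf{l}'\mathbf{l}^0)\delta_{\alpha'\alpha^0}$ respectively.

```python
import math


def discrete_emission_coeff(e, m, h, nu, s_p):
    """Linear term of (74) for one discrete state, with s_k = s_p * hbar^3."""
    hbar = h / (2.0 * math.pi)
    s_k = s_p * hbar ** 3
    return e * math.sqrt(h) / (4.0 * math.pi ** 2 * m * math.sqrt(nu) * math.sqrt(s_k))


def continuous_emission_coeff(e, m, h, nu):
    """Coefficient in (75) before p is replaced by m * xdot."""
    return e / (m * h * math.sqrt(2.0 * math.pi * nu))


def discrete_scatter_coeff(e, m, h, nu0, nu1, s_p0, s_p1):
    """Two equal contributions from the quadratic term of (74)."""
    hbar = h / (2.0 * math.pi)
    s_k0, s_k1 = s_p0 * hbar ** 3, s_p1 * hbar ** 3
    return 2.0 * e ** 2 * h / (32.0 * math.pi ** 4 * m * math.sqrt(nu0 * nu1 * s_k0 * s_k1))


def continuous_scatter_coeff(e, m, h, nu0, nu1):
    """Coefficient in (77)."""
    return e ** 2 / (2.0 * math.pi * h ** 2 * m * math.sqrt(nu0 * nu1))
```

Multiplying a discrete coefficient by $s_p^{1/2}$ for each photon state
involved should reproduce the corresponding continuous coefficient.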


64.  Emission,  absorption,  and  scattering  of  radiation 

We can now determine directly the coefficients of emission, absorption,
and scattering of radiation by substituting in the formulas of
Chapter VIII the values for the matrix elements given by (75), (76),
and (77).

For determining the emission probability we can use formula (56) of
§ 53. This shows that for an atom in a state $\alpha^0$ the probability
per unit time per unit solid angle of its spontaneously emitting a
photon and dropping to a state $\alpha'$ of lower energy is

$$\frac{4\pi^2}{h} \frac{W'P'}{c^2} \left|\frac{e}{h(2\pi\nu)^{1/2}} \langle\alpha'|\dot{x}_{l'}|\alpha^0\rangle\right|^2. \tag{78}$$

Now the energy and momentum of a photon of frequency $\nu$ are

$$W' = h\nu, \qquad P' = h\nu/c.$$

Again, from the Heisenberg law (20) of § 29,

$$\langle\alpha'|\dot{x}_{l'}|\alpha^0\rangle = -2\pi i\, \nu(\alpha^0\alpha')\, \langle\alpha'|x_{l'}|\alpha^0\rangle,$$

$\nu(\alpha^0\alpha')$ being the frequency connected with transitions
from state $\alpha^0$ to state $\alpha'$, which in the present case is
just the frequency $\nu$ of the emitted radiation. These results
substituted in (78) make the emission coefficient reduce to

$$\frac{(2\pi\nu)^3}{hc^3} |\langle\alpha'|e x_{l'}|\alpha^0\rangle|^2. \tag{79}$$

To obtain the rate of emission of energy per unit solid angle for a
specified polarization, we must multiply this by $h\nu$. This gives for
the total rate of emission of energy in all directions

$$\frac{4}{3} \frac{(2\pi\nu)^4}{c^3} |\langle\alpha'|e\mathbf{x}|\alpha^0\rangle|^2, \tag{80}$$

which is in agreement with expression (34) of § 45 and justifies
Heisenberg's assumption for the interpretation of his matrix elements.
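
To get a feeling for the size of the total emission rate, one may
divide the energy rate (80) by $h\nu$ and evaluate it for a definite
transition. The sketch below does this for the $2p \to 1s$ transition of
hydrogen. Several ingredients are assumptions not taken from the text:
the constants are rounded CODATA values, $e^2$ is written as
$\alpha\hbar c$ (Gaussian units), the nucleus is taken as infinitely
heavy, and the dipole matrix element
$|\langle 1s|z|2p\rangle| = (128\sqrt{2}/243)\,a_0$ is the standard
textbook value for hydrogen. The function names are hypothetical.

```python
import math

ALPHA = 7.2973525693e-3    # fine-structure constant, e^2 / (hbar c)
C = 2.99792458e8           # speed of light, m/s
HBAR = 1.054571817e-34     # J s
EV = 1.602176634e-19       # J per electron-volt
BOHR = 5.29177210903e-11   # Bohr radius, m
RYDBERG_EV = 13.605693123  # hydrogen ground-state binding energy, eV


def spontaneous_rate(omega: float, x_abs: float) -> float:
    """Photons per second: (80) divided by h nu, with e^2 = alpha hbar c.

    omega = 2 pi nu is the angular frequency, x_abs = |<a'|x|a0>| in metres.
    """
    return 4.0 * ALPHA * omega ** 3 * x_abs ** 2 / (3.0 * C ** 2)


def lyman_alpha_rate() -> float:
    """Spontaneous 2p -> 1s rate for hydrogen under the stated assumptions."""
    omega = 0.75 * RYDBERG_EV * EV / HBAR                 # (1 - 1/4) Rydberg
    x_abs = 128.0 * math.sqrt(2.0) / 243.0 * BOHR         # assumed matrix element
    return spontaneous_rate(omega, x_abs)
```

The result is of the order of $6 \times 10^8$ per second, i.e. a mean
life of about 1.6 nanoseconds for the $2p$ state.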

In the same way the absorption coefficient, given by formula (59) of
§ 53, becomes for photons

$$\frac{4\pi^2 h^2 W^0}{c^2 P^0} \left|\frac{e}{h(2\pi\nu)^{1/2}} \langle\alpha'|\dot{x}_{l^0}|\alpha^0\rangle\right|^2 = \frac{8\pi^3\nu}{c} |\langle\alpha'|e x_{l^0}|\alpha^0\rangle|^2.$$

This absorption coefficient refers to an incident beam of one photon
crossing unit area per unit time per unit energy range. If we take one
per unit frequency range instead of energy range, as is usual when
dealing with radiation, the absorption coefficient becomes

$$\frac{8\pi^3\nu}{hc} |\langle\alpha'|e x_{l^0}|\alpha^0\rangle|^2.$$

This result is the same as (32) of § 45, if we substitute for the
$E_\nu$ there the energy $h\nu$ of a single photon. Thus the elementary
theory of § 45, in which the radiation field is treated as an external
perturbation, gives the correct value for the absorption coefficient.

This agreement between the elementary theory and the present theory
could be inferred from general arguments. The two theories differ only
in that the field quantities all commute with one another in the
elementary theory and satisfy definite commutation relations in the
present theory, and this difference becomes unimportant for strong
fields. Thus the two theories must give the same absorption and
emission when strong fields are concerned. Since both theories give the
rate of absorption proportional to the intensity of the incident beam,
the agreement must hold also for weak fields in the case of absorption.
In the same way the stimulated part of the emission in the present
theory must agree with the emission in the elementary theory.

Let us now consider scattering. The direct scattering coefficient is
given by formula (38) of § 50. Such scattering of photons will not be
accompanied by any change of state of the atom on account of the factor
$\delta_{\alpha'\alpha^0}$ in the expression for the matrix element
(77). Thus the final energy $W'$ of the photon will equal its initial
energy $W^0$. The scattering coefficient now reduces to

$$e^4/m^2c^4 \cdot (\mathbf{l}'\mathbf{l}^0)^2.$$

This is the same as that given by classical mechanics for the
scattering of radiation by a free electron. We thus see that the direct
scattering of radiation by an electron in an atom is independent of the
atom and is correctly given by the classical theory. This result, it
should be remembered, holds only provided the wavelength of the
radiation is large compared with the dimensions of the atom.
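
The direct scattering coefficient can be integrated over all directions
and summed over the two polarizations of the scattered photon to give
the total classical cross-section. The sketch below does this
numerically for an incident polarization $\mathbf{l}^0$ along the
$z$-axis. It assumes, as an input not given in the text, the numerical
value of the classical electron radius $e^2/mc^2$, and uses the fact
that the sum of $(\mathbf{l}'\mathbf{l}^0)^2$ over the two directions
$\mathbf{l}'$ perpendicular to the direction of scattering is
$\sin^2\theta$, $\theta$ being the angle between that direction and
$\mathbf{l}^0$. The function name is hypothetical.

```python
import math

R_E = 2.8179403262e-15  # classical electron radius e^2 / (m c^2), in metres


def thomson_total(n_theta: int = 2000) -> float:
    """Integrate r_e^2 * sin^2(theta) over the full solid angle (midpoint rule)."""
    d_theta = math.pi / n_theta
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        polarization_sum = math.sin(theta) ** 2  # sum over l' of (l' . l0)^2
        # the integrand does not depend on the azimuth, which contributes 2 pi
        total += polarization_sum * math.sin(theta) * d_theta * 2.0 * math.pi
    return R_E ** 2 * total
```

The integral reproduces the closed form $(8\pi/3)(e^2/mc^2)^2$, the
classical cross-section for scattering by a free electron.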

The direct scattering is a mathematical concept and cannot be separated
out experimentally from the total scattering, given by formula (44) of
§ 51. Let us see what this total scattering is in the case of photons.
We must be careful in our application of formula (44) of § 51. The
summation $\sum_k$ in this formula may be considered as representing
the contribution to the scattering of double transitions consisting of
transitions firstly from the initial state to state $k$ and secondly
from state $k$ to the final state. The first transition may be an
absorption of the incident photon and the second an emission of the
required scattered photon, but it is also possible for the first
transition to be the emission and the second the absorption. It is
clear from the general nature of the method used for deriving formula
(44) of § 51 that both these kinds of double transitions must be
included in the summation $\sum_k$ when this formula is applied to
photons, although only the first of them appears in the actual
derivation given in § 51, as the possibility of the particle being
created or annihilated was not taken into account there.

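We use zero, single prime, and double prime to refer to the initial,
final, and intermediate states of the atom respectively, and zero and
single prime to refer to the absorbed and emitted photons
respectively. Then, for the double transition of absorption followed by
emission, we must take for the matrix elements

$$\langle k|V|p^0\alpha^0\rangle, \qquad \langle p'\alpha'|V|k\rangle$$

of the formula (44) of § 51

$$\langle k|V|p^0\alpha^0\rangle = \langle\alpha''|H_Q|p^0l^0\alpha^0\rangle, \qquad \langle p'\alpha'|V|k\rangle = \langle p'l'\alpha'|H_Q|\alpha''\rangle.$$

Also

$$E' - E_k = h\nu^0 + H_P(\alpha^0) - H_P(\alpha'') = h[\nu^0 - \nu(\alpha''\alpha^0)],$$

where

$$h\nu(\alpha''\alpha^0) = H_P(\alpha'') - H_P(\alpha^0).$$

Similarly, for the double transition of emission followed by absorption
we must take

$$\langle k|V|p^0\alpha^0\rangle = \langle p'l'\alpha''|H_Q|\alpha^0\rangle, \qquad \langle p'\alpha'|V|k\rangle = \langle\alpha'|H_Q|p^0l^0\alpha''\rangle$$

and

$$E' - E_k = h\nu^0 + H_P(\alpha^0) - H_P(\alpha'') - h\nu^0 - h\nu' = -h[\nu' + \nu(\alpha''\alpha^0)],$$

there being now two photons, of frequencies $\nu^0$ and $\nu'$, in
existence for the intermediate state. Substituting in (44) of § 51 the
values of the matrix elements given by (75), (76), and (77), we get for
the scattering coefficient

$$\frac{e^4}{h^2c^4} \frac{\nu'}{\nu^0} \left|\frac{h}{m} (\mathbf{l}'\mathbf{l}^0)\, \delta_{\alpha'\alpha^0} + \sum_{\alpha''} \left\{\frac{\langle\alpha'|\dot{x}_{l'}|\alpha''\rangle \langle\alpha''|\dot{x}_{l^0}|\alpha^0\rangle}{\nu^0 - \nu(\alpha''\alpha^0)} - \frac{\langle\alpha'|\dot{x}_{l^0}|\alpha''\rangle \langle\alpha''|\dot{x}_{l'}|\alpha^0\rangle}{\nu' + \nu(\alpha''\alpha^0)}\right\}\right|^2. \tag{81}$$

If we write (81) in terms of $x$ instead of $\dot{x}$, we get

$$\frac{(2\pi e)^4}{h^2c^4} \frac{\nu'}{\nu^0} \left|\frac{\hbar}{2\pi m} (\mathbf{l}'\mathbf{l}^0)\, \delta_{\alpha'\alpha^0} - \sum_{\alpha''} \nu(\alpha'\alpha'')\, \nu(\alpha''\alpha^0) \left\{\frac{\langle\alpha'|x_{l'}|\alpha''\rangle \langle\alpha''|x_{l^0}|\alpha^0\rangle}{\nu^0 - \nu(\alpha''\alpha^0)} - \frac{\langle\alpha'|x_{l^0}|\alpha''\rangle \langle\alpha''|x_{l'}|\alpha^0\rangle}{\nu' + \nu(\alpha''\alpha^0)}\right\}\right|^2. \tag{82}$$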

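We can simplify (82) with the help of the quantum conditions. We have

$$x_{l'} x_{l^0} - x_{l^0} x_{l'} = 0,$$

which gives

$$\sum_{\alpha''} \left\{\langle\alpha'|x_{l'}|\alpha''\rangle \langle\alpha''|x_{l^0}|\alpha^0\rangle - \langle\alpha'|x_{l^0}|\alpha''\rangle \langle\alpha''|x_{l'}|\alpha^0\rangle\right\} = 0, \tag{83}$$

and also

$$x_{l'} \dot{x}_{l^0} - \dot{x}_{l^0} x_{l'} = \frac{1}{m} (x_{l'} p_{l^0} - p_{l^0} x_{l'}) = \frac{i\hbar}{m} (\mathbf{l}'\mathbf{l}^0),$$

which gives

$$\sum_{\alpha''} \left\{\langle\alpha'|x_{l'}|\alpha''\rangle\, \nu(\alpha''\alpha^0)\, \langle\alpha''|x_{l^0}|\alpha^0\rangle - \nu(\alpha'\alpha'')\, \langle\alpha'|x_{l^0}|\alpha''\rangle \langle\alpha''|x_{l'}|\alpha^0\rangle\right\} = \frac{1}{2\pi i} \frac{i\hbar}{m} (\mathbf{l}'\mathbf{l}^0)\, \delta_{\alpha'\alpha^0} = \frac{\hbar}{2\pi m} (\mathbf{l}'\mathbf{l}^0)\, \delta_{\alpha'\alpha^0}. \tag{84}$$

Multiplying (83) by $\nu'$ and adding to (84), we obtain

$$\sum_{\alpha''} \left\{\langle\alpha'|x_{l'}|\alpha''\rangle \langle\alpha''|x_{l^0}|\alpha^0\rangle\, [\nu' + \nu(\alpha''\alpha^0)] - \langle\alpha'|x_{l^0}|\alpha''\rangle \langle\alpha''|x_{l'}|\alpha^0\rangle\, [\nu' + \nu(\alpha'\alpha'')]\right\} = \frac{\hbar}{2\pi m} (\mathbf{l}'\mathbf{l}^0)\, \delta_{\alpha'\alpha^0}.$$

If we substitute this expression for
$(\hbar/2\pi m)(\mathbf{l}'\mathbf{l}^0)\delta_{\alpha'\alpha^0}$ in
(82), we obtain, after a straightforward reduction making use of
identical relations between the $\nu$'s,

$$\frac{(2\pi e)^4}{h^2c^4}\, \nu^0 \nu'^3 \left|\sum_{\alpha''} \left\{\frac{\langle\alpha'|x_{l'}|\alpha''\rangle \langle\alpha''|x_{l^0}|\alpha^0\rangle}{\nu^0 - \nu(\alpha''\alpha^0)} - \frac{\langle\alpha'|x_{l^0}|\alpha''\rangle \langle\alpha''|x_{l'}|\alpha^0\rangle}{\nu' + \nu(\alpha''\alpha^0)}\right\}\right|^2. \tag{85}$$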

This gives the scattering coefficient in the form of the effective
area that a photon has to hit per unit solid angle of scattering. It is
known as the Kramers-Heisenberg dispersion formula, having been first
obtained by these authors from analogies with the classical theory of
dispersion.
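
The 'straightforward reduction' leading from (82) to (85) rests on the
identity $\nu(\alpha'\alpha'') = \nu^0 - \nu' - \nu(\alpha''\alpha^0)$,
which follows from the conservation of energy. The sketch below checks
the reduction numerically for a single intermediate state. It is an
illustration under stated assumptions: `u` stands for
$\nu(\alpha''\alpha^0)$, the two returned numbers are the coefficients
of the products
$\langle\alpha'|x_{l'}|\alpha''\rangle\langle\alpha''|x_{l^0}|\alpha^0\rangle$
and
$\langle\alpha'|x_{l^0}|\alpha''\rangle\langle\alpha''|x_{l'}|\alpha^0\rangle$,
and the function names are hypothetical.

```python
def coefficients_before(nu0: float, nu1: float, u: float):
    """Coefficients inside (82) after the delta term is replaced by the sum rule.

    nu0, nu1 are the incident and scattered frequencies; u = nu(alpha'' alpha0).
    """
    w = nu0 - nu1 - u  # nu(alpha' alpha''), fixed by energy conservation
    coeff_first = (nu1 + u) - w * u / (nu0 - u)
    coeff_second = -(nu1 + w) + w * u / (nu1 + u)
    return coeff_first, coeff_second


def coefficients_after(nu0: float, nu1: float, u: float):
    """The same coefficients in the reduced form: nu0 * nu1 times the terms of (85)."""
    return nu0 * nu1 / (nu0 - u), -nu0 * nu1 / (nu1 + u)
```

The two functions agree for arbitrary frequencies away from the
resonances $u = \nu^0$ and $u = -\nu'$. The common factor
$\nu^0\nu'$, squared and multiplied by the $\nu'/\nu^0$ of (82),
accounts for the factor $\nu^0\nu'^3$ in the final formula.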

The fact that the various terms in (82) can be combined to give the
result (85) justifies the assumption made in deriving formula (44) of
§ 51, that the matrix elements
$\langle p'\alpha'|V|p^0\alpha^0\rangle$ of the interaction energy are
of the second order of smallness compared with the
$\langle p'\alpha'|V|k\rangle$ ones, at any rate when the scattered
particles are photons.

65.  An  assembly  of  fermions 

An assembly of fermions can be treated by a method similar to that
used in §§ 59 and 60 for bosons. With the kets (1) we may use the
antisymmetrizing operator $A$ defined by

$$A = u'!^{-1/2} \sum \pm P, \tag{2'}$$

summed over all permutations $P$, the $+$ or $-$ sign being taken
according to whether $P$ is even or odd. Applied to the ket (1) it
gives

$$u'!^{-1/2} \sum \pm P\, |\alpha_1^a \alpha_2^b \alpha_3^c \ldots \alpha_{u'}^g\rangle = A|\alpha^a \alpha^b \alpha^c \ldots \alpha^g\rangle, \tag{3'}$$

a ket corresponding to a state for an assembly of $u'$ fermions. The
ket (3') is normalized provided the individual fermion kets
$|\alpha^a\rangle$, $|\alpha^b\rangle$,... are all different, otherwise
it is zero. In this respect the ket (3') is simpler than the ket (3).
However, (3') is more complicated than (3) in that (3') depends on the
order in which $\alpha^a$, $\alpha^b$, $\alpha^c$,... occur in it,
being subject to a change of sign if an odd permutation is applied to
this order.

We can, as before, introduce the numbers $n_1$, $n_2$, $n_3$,... of
fermions in the states $\alpha^{(1)}$, $\alpha^{(2)}$,
$\alpha^{(3)}$,... and treat them as dynamical variables or
observables. They each have as eigenvalues only 0 and 1. They form a
complete set of commuting observables for the assembly of fermions. The
basic kets of a representation with the $n$'s diagonal may be taken to
be connected with the kets (3') by the equation

$$A|\alpha^a \alpha^b \alpha^c \ldots \alpha^g\rangle = \pm|n'_1 n'_2 n'_3 \ldots\rangle \tag{6'}$$

corresponding to (6), the $n'$'s being connected with the variables
$\alpha^a$, $\alpha^b$, $\alpha^c$,... by equation (4). The $\pm$ sign
is needed in (6') since, for given $n'$'s, the occupied states
$\alpha^a$, $\alpha^b$, $\alpha^c$,... are fixed but not their order,
so that the sign of the left-hand side of (6') is not fixed. To set up
a rule which determines the sign in (6'), we must arrange all the
states $\alpha$ for a fermion arbitrarily in some standard order. The
$\alpha$'s occurring in the left-hand side of (6') form a certain
selection from all the $\alpha$'s and the standard order for all the
$\alpha$'s will give a standard order for this selection. We now make
the rule that the $+$ sign should occur in (6') if the $\alpha$'s on
the left-hand side can be brought into their standard order by an even
permutation and the $-$ sign if an odd permutation is required. Owing
to the complexity of this rule, the representation with the basic kets
$|n'_1 n'_2 n'_3 \ldots\rangle$ is not a very useful one.
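
The sign rule for (6') is easy to state in computational terms. In the
sketch below the fermion states are labelled by integers and the
standard order is assumed to be the ascending order of the labels,
which is one arbitrary choice among many. The function name is
hypothetical. The sign is the parity of the permutation that sorts the
given labels, obtained by counting inversions.

```python
def standard_order_sign(labels):
    """Return +1 or -1 according to the rule for (6').

    labels: the occupied states in the order in which they appear in the ket.
    The standard order is taken to be ascending order (an assumption).
    """
    if len(set(labels)) != len(labels):
        raise ValueError("the occupied fermion states must all be different")
    inversions = sum(
        1
        for i in range(len(labels))
        for j in range(i + 1, len(labels))
        if labels[i] > labels[j]
    )
    # an even number of inversions means an even permutation
    return -1 if inversions % 2 else 1
```

For instance, the order 2, 1, 3 needs one interchange to reach the
standard order and so takes the $-$ sign, while 3, 1, 2 needs two and
takes the $+$ sign.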

If the number of fermions in the assembly is variable, we can set up
the complete set of kets

$$|\rangle, \quad |\alpha^a\rangle, \quad A|\alpha^a \alpha^b\rangle, \quad A|\alpha^a \alpha^b \alpha^c\rangle, \quad \ldots, \tag{9'}$$

corresponding to (9). A general ket is now expressible as a sum of the
various kets (9').

To continue with the development we introduce a set of linear
operators $\eta$, $\bar{\eta}$, one pair $\eta_a$, $\bar{\eta}_a$
corresponding to each fermion state $\alpha^a$, satisfying the
commutation relations

$$\bar{\eta}_a \bar{\eta}_b + \bar{\eta}_b \bar{\eta}_a = 0,$$

$$\eta_a \eta_b + \eta_b \eta_a = 0, \tag{11'}$$

$$\bar{\eta}_a \eta_b + \eta_b \bar{\eta}_a = \delta_{ab}.$$

These relations are like (11) with a $+$ sign instead of a $-$ on the
left-hand side. They show that, for $a \neq b$, $\eta_a$ and
$\bar{\eta}_a$ anticommute with $\eta_b$ and $\bar{\eta}_b$, while,
putting $b = a$, they give

$$\eta_a^2 = 0, \qquad \bar{\eta}_a^2 = 0, \qquad \bar{\eta}_a \eta_a + \eta_a \bar{\eta}_a = 1. \tag{11''}$$

To verify that the relations (11') are consistent, we note that linear
operators $\eta$, $\bar{\eta}$ satisfying the conditions (11') can be
constructed in the following way. For each state $\alpha^a$ we take a
set of linear operators $\sigma_{xa}$, $\sigma_{ya}$, $\sigma_{za}$
like the $\sigma_x$, $\sigma_y$, $\sigma_z$ introduced in § 37 to
describe the spin of an electron and such that $\sigma_{xa}$,
$\sigma_{ya}$, $\sigma_{za}$ commute with $\sigma_{xb}$, $\sigma_{yb}$,
$\sigma_{zb}$ for $b \neq a$. We also take an independent set of linear
operators $\zeta_a$, one for each state $\alpha^a$, which all
anticommute with one another and have their squares unity, and commute
with all the $\sigma$ variables. Then, putting

$$\eta_a = \tfrac{1}{2} \zeta_a (\sigma_{xa} - i\sigma_{ya}), \qquad \bar{\eta}_a = \tfrac{1}{2} \zeta_a (\sigma_{xa} + i\sigma_{ya}),$$

we have all the conditions (11') satisfied.

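From (11'')

$$(\eta_a \bar{\eta}_a)^2 = \eta_a \bar{\eta}_a \eta_a \bar{\eta}_a = \eta_a (1 - \eta_a \bar{\eta}_a) \bar{\eta}_a = \eta_a \bar{\eta}_a.$$

This is an algebraic equation for $\eta_a \bar{\eta}_a$, showing that
$\eta_a \bar{\eta}_a$ is an observable with the eigenvalues 0 and 1.
Also $\eta_a \bar{\eta}_a$ commutes with $\eta_b \bar{\eta}_b$ for
$b \neq a$. These results allow us to put

$$\eta_a \bar{\eta}_a = n_a, \tag{12'}$$

the same as (12). From (11'') we get now

$$\bar{\eta}_a \eta_a = 1 - n_a,$$

the equation corresponding to (13).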

Let us write the normalized ket which is an eigenket of all the $n$'s
belonging to the eigenvalues zero as $|0\rangle$. Then

$$n_a|0\rangle = 0,$$

so from (12')

$$\langle 0|\eta_a \bar{\eta}_a|0\rangle = 0.$$

Hence

$$\bar{\eta}_a|0\rangle = 0,$$

like (15). Again

$$\langle 0|\bar{\eta}_a \eta_a|0\rangle = \langle 0|(1 - n_a)|0\rangle = \langle 0|0\rangle = 1,$$

showing that $\eta_a|0\rangle$ is normalized, and

$$n_a \eta_a|0\rangle = \eta_a \bar{\eta}_a \eta_a|0\rangle = \eta_a (1 - n_a)|0\rangle = \eta_a|0\rangle,$$

showing that $\eta_a|0\rangle$ is an eigenket of $n_a$ belonging to the
eigenvalue unity. It is an eigenket of the other $n$'s belonging to the
eigenvalues zero, since the other $n$'s commute with $\eta_a$. By
generalizing the argument we see that
$\eta_a \eta_b \eta_c \ldots \eta_g|0\rangle$ is normalized and is a
simultaneous eigenket of all the $n$'s, belonging to the eigenvalues
unity for $n_a$, $n_b$, $n_c$,..., $n_g$ and zero for the other $n$'s.
This enables us to put

$$A|\alpha^a \alpha^b \alpha^c \ldots \alpha^g\rangle = \eta_a \eta_b \eta_c \ldots \eta_g|0\rangle, \tag{17'}$$

both sides being antisymmetrical in the labels $a$, $b$, $c$,..., $g$.
We have here the analogue of (17). The $\eta$'s appear as creation
operators for a fermion and the $\bar{\eta}$'s as annihilation
operators.
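
The relations (11') can be checked concretely on a small number of
fermion states. The sketch below does not use the $\sigma$, $\zeta$
construction of the text. As a swapped-in alternative it lets the
operators act directly on occupation-number basis states, with a sign
$(-1)^{\text{number of occupied states standing earlier in the standard order}}$,
in the spirit of the sign rule for (6'). All names are hypothetical,
and `"c"` and `"a"` denote $\eta$ (creation) and $\bar{\eta}$
(annihilation).

```python
from itertools import product


def act(kind, a, state):
    """Apply eta_a ("c") or eta_bar_a ("a") to a basis state (tuple of 0/1).

    Returns (sign, new_state), or None when the result is the zero ket.
    """
    wanted = 0 if kind == "c" else 1
    if state[a] != wanted:
        return None
    sign = -1 if sum(state[:a]) % 2 else 1
    return sign, state[:a] + (1 - wanted,) + state[a + 1:]


def apply_product(ops, state):
    """Apply a product of operators to a basis state, rightmost factor first."""
    sign = 1
    for kind, a in reversed(ops):
        result = act(kind, a, state)
        if result is None:
            return None
        step_sign, state = result
        sign *= step_sign
    return sign, state


def total(state, *products):
    """Sum several operator products acting on one basis state, as a dict."""
    out = {}
    for ops in products:
        result = apply_product(ops, state)
        if result is not None:
            sign, new_state = result
            out[new_state] = out.get(new_state, 0) + sign
    return {k: v for k, v in out.items() if v != 0}


def relations_hold(n_states=3):
    """Check the three relations (11') on every basis state."""
    for state in product((0, 1), repeat=n_states):
        for a in range(n_states):
            for b in range(n_states):
                if total(state, [("c", a), ("c", b)], [("c", b), ("c", a)]):
                    return False
                if total(state, [("a", a), ("a", b)], [("a", b), ("a", a)]):
                    return False
                expected = {state: 1} if a == b else {}
                if total(state, [("a", a), ("c", b)], [("c", b), ("a", a)]) != expected:
                    return False
    return True
```

With this convention a product of creation operators in standard order
applied to the empty state gives the occupied state with the $+$ sign,
and interchanging two of the creation operators reverses the sign, as
required by the antisymmetry of (17').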

If we pass over to a different set of basic kets $|\beta^A\rangle$ for
a fermion, we can introduce a new set of linear operators $\eta_A$
corresponding to them. We then find, by the same argument as in the
case of bosons, that the new $\eta$'s are connected with the original
ones by (21). This shows that there is a procedure of second
quantization for fermions, similar to that for bosons, with the only
difference that the commutation relations (11') must be employed for
fermions to replace the commutation relations (11) for bosons.

A symmetrical linear operator $U_T$ of the form (22) can be expressed
in terms of the $\eta$, $\bar{\eta}$ variables by a method similar to
that used for bosons. Equation (24) still holds, and so does (25) with
$S$ replaced by $A$. Instead of (26) we now have

$$U_T\, \eta_{x_1} \eta_{x_2} \ldots |0\rangle = \sum_a \sum_r (-)^{r-1}\, \eta_a\, \eta_{x_r}^{-1} \eta_{x_1} \eta_{x_2} \ldots |0\rangle\, \langle a|U|x_r\rangle, \tag{26'}$$

$\eta_{x_r}^{-1}$ meaning that the factor $\eta_{x_r}$ must be
cancelled out, without its position among the other $\eta_x$'s being
changed before the cancellation. Instead of (27) we have

$$\bar{\eta}_b\, \eta_{x_1} \eta_{x_2} \ldots |0\rangle = \sum_r (-)^{r-1}\, \eta_{x_r}^{-1} \eta_{x_1} \eta_{x_2} \ldots |0\rangle\, \delta_{b x_r}, \tag{27'}$$

so (28) holds unchanged and thus (29) holds unchanged. We have the same
final form (29) for $U_T$ in the fermion case as in the boson case.
Similarly, a symmetrical linear operator $V_T$ of the form (30) can be
expressed as

$$V_T = \tfrac{1}{2} \sum_{abcd} \eta_a \eta_b\, \langle ab|V|cd\rangle\, \bar{\eta}_d \bar{\eta}_c, \tag{35'}$$

the same as one of the ways of writing (35).

The  foregoing  work  shows  that  there  is  a deep-seated  analogy 
between  the  theory  of  fermions  and  that  of  bosons,  only  slight 
changes  having  to  be  made  in  the  general  equations  of  the  formalism 
when  one  passes  from  one  to  the  other. 
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There  is,  however,  a development  of  the  theory  of  fermions  that 
has  no  analogue  for  bosons.  For  fermions  there  are  only  the  two 
alternatives  of  a state  being  occupied  or  unoccupied  and  there  is 
symmetry  between  these  two  alternatives.  One  can  demonstrate  the 
symmetry  mathematically  by  making  a transformation  which  inter- 
changes the  concepts  of  ‘occupied’  and  ‘unoccupied’,  namely 

Va  = Va’  Va  = Va> 

K = Va  Va  = -!—»„• 

The  creation  operators  of  the  unstarred  variables  are  the  annihilation 
operators  of  the  starred  variables,  and  vice  versa.  The  starred  variables 
are  now  seen  to  satisfy  the  same  quantum  conditions  and  to  have  all 
the  same  properties  as  the  unstarred  ones. 
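
The interchange of 'occupied' and 'unoccupied' can be checked by hand for a single fermion state. The sketch below is only an illustration under stated assumptions: it represents one state by 2x2 matrices in the basis (unoccupied, occupied), and the names `eta`, `eta_bar`, `eta_star`, `full_ket` are labels chosen here, not notation from the text.

```python
# One fermion state, basis ordered as (unoccupied, occupied).
# Assumption for illustration: eta creates the particle, eta_bar removes it.

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def add(a, b):
    """Element-wise sum of two matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def apply(a, v):
    """Matrix acting on a column vector."""
    return [sum(a[i][k] * v[k] for k in range(len(v))) for i in range(len(a))]

IDENTITY = [[1, 0], [0, 1]]
eta = [[0, 0], [1, 0]]        # unoccupied -> occupied
eta_bar = [[0, 1], [0, 0]]    # occupied -> unoccupied
n = matmul(eta, eta_bar)      # occupation number, eigenvalues 0 and 1

# Starred variables: creation and annihilation change places.
eta_star = eta_bar
eta_bar_star = eta
n_star = matmul(eta_star, eta_bar_star)

one_minus_n = [[IDENTITY[i][j] - n[i][j] for j in range(2)] for i in range(2)]
anticommutator_star = add(matmul(eta_star, eta_bar_star),
                          matmul(eta_bar_star, eta_star))

full_ket = [0, 1]             # the ket |0*>, with the state occupied

print("n* equals 1 - n:", n_star == one_minus_n)
print("starred anticommutator is unity:", anticommutator_star == IDENTITY)
print("n|0*> =", apply(n, full_ket), " eta_bar*|0*> =", apply(eta_bar_star, full_ket))
```

The starred operators obey the same anticommutation relation as the unstarred ones, and the fully occupied ket is annihilated by the starred annihilation operator, which is what lets it serve as a standard ket.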

If there are only a few unoccupied states, a convenient standard
ket to work with would be the one for which every state is occupied,
namely $|0^*\rangle$ satisfying

    $$ n_a|0^*\rangle = |0^*\rangle. $$

It thus satisfies

    $$ n_a^*|0^*\rangle = 0, $$

or

    $$ \bar\eta_a^*|0^*\rangle = 0. $$

Other states for the assembly will now be represented by

    $$ \eta_a^*\eta_b^*\eta_c^*\cdots|0^*\rangle, $$

in which variables appear referring to the unoccupied fermion states
$a$, $b$, $c$, ... . We may look upon these unoccupied fermion states as holes
among the occupied ones and the $\eta^*$ variables as the operators of
creation of such holes. The holes are just as much physical things
as the original particles and are also fermions.


66. Relativistic treatment of a particle

The theory we have been building up so far is essentially a
non-relativistic one. We have been working all the time with one
particular Lorentz frame of reference and have set up the theory as an
analogue of the classical non-relativistic dynamics. Let us now try
to make the theory invariant under Lorentz transformations, so that
it conforms to the special principle of relativity. This is necessary in
order that the theory may apply to high-speed particles. There is no
need to make the theory conform to general relativity, since general
relativity is required only when one is dealing with gravitation, and
gravitational forces are quite unimportant in atomic phenomena.

Let us see how the basic ideas of quantum theory can be adapted
to the relativistic point of view that the four dimensions of
space-time should be treated on the same footing. The general principle
of superposition of states, as given in Chapter I, is a relativistic
principle, since it applies to 'states' with the relativistic space-time
meaning. However, the general concept of an observable does not fit
in, since an observable may involve physical things at widely separated
points at one instant of time. In consequence, if one works with a
general representation referring to any complete set of commuting
observables, the theory cannot display the symmetry between space
and time required by relativity. In relativistic quantum mechanics
one must be content with having one representation which displays
this symmetry. One then has the freedom to transform to another
representation referring to a special Lorentz frame of reference if this
is useful for a particular calculation.

For the problem of a single particle, in order to display the
symmetry between space and time we must use the Schrödinger
representation. Let us put $x_1$, $x_2$, $x_3$ for $x$, $y$, $z$, and $x_0$ for $ct$. The
time-dependent wave function then appears as $\psi(x_0 x_1 x_2 x_3)$ and provides
us with a basis for treating the four $x$'s on the same footing.

We shall use relativistic notation, writing the four $x$'s as $x_\mu$
($\mu = 0, 1, 2, 3$). Any space-time vector with four components which
transform under Lorentz transformations like the four elements $dx_\mu$
will be written like $a_\mu$ with a lower Greek suffix. We may raise the
suffix according to the rules

    $$ a^0 = a_0, \quad a^1 = -a_1, \quad a^2 = -a_2, \quad a^3 = -a_3. $$    (1)

The $a_\mu$ are called the contravariant components of the vector $a$, and
the $a^\mu$ the covariant components. Two vectors $a_\mu$ and $b_\mu$ have a
Lorentz-invariant scalar product

    $$ a_0b_0 - a_1b_1 - a_2b_2 - a_3b_3 = a^\mu b_\mu = a_\mu b^\mu, $$

a summation being implied over a repeated letter suffix. The
fundamental tensor $g^{\mu\nu}$ is defined by

    $$ g^{00} = 1, \quad g^{11} = g^{22} = g^{33} = -1, \quad g^{\mu\nu} = 0 \ \text{for}\ \mu \neq \nu. $$    (2)

With its help the rules (1) connecting covariant and contravariant
components may be written

    $$ a^\mu = g^{\mu\nu}a_\nu. $$
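
The raising rule and the invariance of the scalar product can be illustrated numerically. The following is a minimal sketch, assuming vectors are stored as plain lists of their lower-suffix components and taking a boost along $x_1$ as the sample Lorentz transformation; the function names `raise_index`, `scalar_product` and `boost_x1` are hypothetical helpers introduced only for this example.

```python
import math

# Fundamental tensor g^{mu nu} as defined by (2).
G = [[1, 0, 0, 0],
     [0, -1, 0, 0],
     [0, 0, -1, 0],
     [0, 0, 0, -1]]

def raise_index(a_lower):
    """Apply a^mu = g^{mu nu} a_nu to a list of four lower components."""
    return [sum(G[mu][nu] * a_lower[nu] for nu in range(4)) for mu in range(4)]

def scalar_product(a_lower, b_lower):
    """Form a^mu b_mu = a_0 b_0 - a_1 b_1 - a_2 b_2 - a_3 b_3."""
    a_upper = raise_index(a_lower)
    return sum(a_upper[mu] * b_lower[mu] for mu in range(4))

def boost_x1(a_lower, beta):
    """Boost along x_1 with velocity beta*c (an assumed sample transformation)."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    a0, a1, a2, a3 = a_lower
    return [gamma * (a0 - beta * a1), gamma * (a1 - beta * a0), a2, a3]

a = [2.0, 1.0, 0.0, 0.0]
b = [3.0, 1.0, 0.0, 0.0]

before = scalar_product(a, b)
after = scalar_product(boost_x1(a, 0.6), boost_x1(b, 0.6))
print("raised components of a:", raise_index(a))
print("scalar product before and after the boost:", before, after)
```

The two printed scalar products agree to rounding error, which is the invariance stated above.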

In the Schrödinger representation the momentum, whose
components will now be written $p_1$, $p_2$, $p_3$ instead of $p_x$, $p_y$, $p_z$, is equal
to the operator

    $$ p_r = -i\hbar\,\partial/\partial x_r \qquad (r = 1, 2, 3). $$    (3)

Now the four operators $\partial/\partial x_\mu$ form the covariant components of a
4-vector whose contravariant components are written $\partial/\partial x^\mu$. So to
bring (3) into a relativistic theory, we must first write it with its
suffixes balanced,

    $$ p_r = i\hbar\,\partial/\partial x^r, $$

and then extend it to the complete 4-vector equation

    $$ p_\mu = i\hbar\,\partial/\partial x^\mu. $$    (4)

We thus have to introduce a new dynamical variable $p_0$, equal to
the operator $i\hbar\,\partial/\partial x_0$. Since it forms a 4-vector when combined with the
momenta $p_r$, it must have the physical meaning of the energy of the
particle divided by $c$. We can proceed to develop the theory treating
the four $p$'s on the same footing, like the four $x$'s.

In  the  theory  of  the  electron  that  will  be  developed  here  we  shall 
have  to  introduce  a further  degree  of  freedom  describing  an  internal 
motion  of  the  electron.  The  wave  function  will  thus  have  to  involve 
a further  variable  besides  the  four  x’s. 


67.  The  wave  equation  for  the  electron 

Let us consider first the case of the motion of an electron in the
absence of an electromagnetic field, so that the problem is simply
that of the free particle, as dealt with in § 30, with the possible
addition of internal degrees of freedom. The relativistic Hamiltonian
provided by classical mechanics for this system is given by equation
(23) of § 30, and leads to the wave equation

    $$ \{p_0 - (m^2c^2 + p_1^2 + p_2^2 + p_3^2)^{1/2}\}\psi = 0, $$    (5)

where the $p$'s are interpreted as operators in accordance with
(4). Equation (5), although it takes into account the relation between
energy and momentum required by relativity, is yet unsatisfactory
from the point of view of relativistic theory, because it is very
unsymmetrical between $p_0$ and the other $p$'s, so much so that one cannot
generalize it in a relativistic way to the case when there is a field
present. We must therefore look for a new wave equation.

If we multiply the wave equation (5) on the left by the operator
$\{p_0 + (m^2c^2 + p_1^2 + p_2^2 + p_3^2)^{1/2}\}$, we obtain the equation

    $$ \{p_0^2 - m^2c^2 - p_1^2 - p_2^2 - p_3^2\}\psi = 0, $$    (6)

which  is  of  a relativistically  invariant  form  and  may  therefore  more 
conveniently  be  taken  as  the  basis  of  a relativistic  theory.  Equation 
(6)  is  not  completely  equivalent  to  equation  (5)  since,  although  every 
solution  of  (5)  is  also  a solution  of  (6),  the  converse  is  not  true.  Only 
those  solutions  of  (6)  belonging  to  positive  values  for  p0  are  also 
solutions  of  (5). 

The wave equation (6) is not of the form required by the general
laws of the quantum theory on account of its being quadratic in $p_0$.
In § 27 we deduced from quite general arguments that the wave
equation must be linear in the operator $\partial/\partial t$ or $p_0$, like equation (7)
of that section. We therefore seek a wave equation that is linear
in $p_0$ and that is roughly equivalent to (6). In order that this wave
equation shall transform in a simple way under a Lorentz
transformation, we try to arrange that it shall be rational and linear in $p_1$, $p_2$,
and $p_3$ as well as in $p_0$, and thus of the form

    $$ \{p_0 - \alpha_1p_1 - \alpha_2p_2 - \alpha_3p_3 - \beta\}\psi = 0, $$    (7)

where the $\alpha$'s and $\beta$ are independent of the $p$'s. Since we are
considering the case of no field, all points in space-time must be equivalent,
so that the operator in the wave equation must not involve the $x$'s.
Thus the $\alpha$'s and $\beta$ must also be independent of the $x$'s, so that they
must commute with the $p$'s and the $x$'s. They therefore describe
some new degree of freedom, belonging to some internal motion in
the electron. We shall see later that they bring in the spin of the
electron.

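Multiplying (7) by the operator $\{p_0 + \alpha_1p_1 + \alpha_2p_2 + \alpha_3p_3 + \beta\}$ on the
left, we obtain

    $$ \Bigl\{p_0^2 - \sum_{123}\bigl[\alpha_1^2p_1^2 + (\alpha_1\alpha_2 + \alpha_2\alpha_1)p_1p_2 + (\alpha_1\beta + \beta\alpha_1)p_1\bigr] - \beta^2\Bigr\}\psi = 0, $$

where $\sum_{123}$ refers to cyclic permutations of the suffixes 1, 2, 3. This is
the same as (6) if the $\alpha$'s and $\beta$ satisfy the relations

    $$ \alpha_1^2 = 1, \qquad \alpha_1\alpha_2 + \alpha_2\alpha_1 = 0, $$

    $$ \beta^2 = m^2c^2, \qquad \alpha_1\beta + \beta\alpha_1 = 0, $$

together with the relations obtained from these by permuting the
suffixes 1, 2, 3. If we write

    $$ \beta = \alpha_m mc, $$

these relations may be summed up in the single one,

    $$ \alpha_a\alpha_b + \alpha_b\alpha_a = 2\delta_{ab} \qquad (a, b = 1, 2, 3, \text{ or } m). $$    (8)

The four $\alpha$'s all anticommute with one another and the square of
each is unity.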

Thus by giving suitable properties to the $\alpha$'s and $\beta$ we can make
the wave equation (7) equivalent to (6), in so far as the motion of
the electron as a whole is concerned. We may now assume (7) is the
correct relativistic wave equation for the motion of an electron in
the absence of a field. This gives rise to one difficulty, however,
owing to the fact that (7), like (6), is not exactly equivalent to (5),
but allows solutions corresponding to negative as well as positive
values of $p_0$. The former do not, of course, correspond to any actually
observable motion of an electron. For the present we shall consider
only the positive-energy solutions and shall leave the discussion of
the negative-energy ones to § 73.

We can easily obtain a representation of the four $\alpha$'s. They have
similar algebraic properties to the $\sigma$'s introduced in § 37, which $\sigma$'s
can be represented by matrices with two rows and columns. So long
as we keep to matrices with two rows and columns we cannot get a
representation of more than three anticommuting quantities, and we
have to go to four rows and columns to get a representation of the
four anticommuting $\alpha$'s. It is convenient first to express the $\alpha$'s in
terms of the $\sigma$'s and also of a second similar set of three
anticommuting variables whose squares are unity, $\rho_1$, $\rho_2$, $\rho_3$ say, that are
independent of and commute with the $\sigma$'s. We may take, amongst
other possibilities,

    $$ \alpha_1 = \rho_1\sigma_1, \quad \alpha_2 = \rho_1\sigma_2, \quad \alpha_3 = \rho_1\sigma_3, \quad \alpha_m = \rho_3, $$    (9)

and the $\alpha$'s will then satisfy all the relations (8), as may easily be
verified. If we now take a representation with $\rho_3$ and $\sigma_3$ diagonal,
we shall get the following scheme of matrices:

    $$ \sigma_1 = \begin{pmatrix} 0&1&0&0\\ 1&0&0&0\\ 0&0&0&1\\ 0&0&1&0 \end{pmatrix}, \quad
       \sigma_2 = \begin{pmatrix} 0&-i&0&0\\ i&0&0&0\\ 0&0&0&-i\\ 0&0&i&0 \end{pmatrix}, \quad
       \sigma_3 = \begin{pmatrix} 1&0&0&0\\ 0&-1&0&0\\ 0&0&1&0\\ 0&0&0&-1 \end{pmatrix}, $$

    $$ \rho_1 = \begin{pmatrix} 0&0&1&0\\ 0&0&0&1\\ 1&0&0&0\\ 0&1&0&0 \end{pmatrix}, \quad
       \rho_2 = \begin{pmatrix} 0&0&-i&0\\ 0&0&0&-i\\ i&0&0&0\\ 0&i&0&0 \end{pmatrix}, \quad
       \rho_3 = \begin{pmatrix} 1&0&0&0\\ 0&1&0&0\\ 0&0&-1&0\\ 0&0&0&-1 \end{pmatrix}. $$

It should be noted that the $\rho$'s and $\sigma$'s are all Hermitian, which makes
the $\alpha$'s also Hermitian.
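
The statement that the $\alpha$'s built from the $\rho$'s and $\sigma$'s satisfy (8) and are Hermitian can be verified mechanically. The sketch below is one way to do so with nested lists and no external libraries; it assumes the four-rowed $\sigma$'s and $\rho$'s are Kronecker products of the two-rowed matrices of § 37 with the unit matrix, which reproduces the scheme written out above. The helper names (`kron`, `dagger`, `is_close`) are introduced here for illustration.

```python
# Check of relations (8) for alpha_1, alpha_2, alpha_3, alpha_m.

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def kron(a, b):
    """Kronecker product of two matrices stored as nested lists."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def dagger(a):
    """Conjugate transpose."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def is_close(a, b, tol=1e-12):
    return all(abs(x - y) <= tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

I2 = [[1, 0], [0, 1]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

sigma = [kron(I2, s) for s in PAULI]   # sigma_1, sigma_2, sigma_3 (4x4)
rho = [kron(s, I2) for s in PAULI]     # rho_1, rho_2, rho_3 (4x4)
I4 = kron(I2, I2)

# Equation (9): alpha_r = rho_1 sigma_r, alpha_m = rho_3.
alpha = [matmul(rho[0], s) for s in sigma] + [rho[2]]

relations_hold = all(
    is_close(add(matmul(alpha[a], alpha[b]), matmul(alpha[b], alpha[a])),
             scale(2 if a == b else 0, I4))
    for a in range(4) for b in range(4))
all_hermitian = all(is_close(dagger(m), m) for m in alpha)

print("relations (8) hold:", relations_hold)
print("alphas are Hermitian:", all_hermitian)
```

Since each $\alpha$ has unit square, its eigenvalues can only be $\pm 1$, a fact used later in connexion with the velocity.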

Corresponding to the four rows and columns, the wave function $\psi$
must contain a variable that takes on four values, in order that the
matrices shall be capable of being multiplied into it. Alternatively,
we may look upon the wave function as having four components, each
a function only of the four $x$'s. We saw in § 37 that the spin of the
electron requires the wave function to have two components. The
fact that our present theory gives four is due to our wave equation
(7) having twice as many solutions as it ought to have, half of them
corresponding to states of negative energy.

With the help of (9), the wave equation (7) may be written with
three-dimensional vector notation

    $$ \{p_0 - \rho_1(\boldsymbol\sigma, \mathbf p) - \rho_3mc\}\psi = 0. $$    (10)

To generalize this equation to the case when there is an
electromagnetic field present, we follow the classical rule of replacing $p_0$ and
$\mathbf p$ by $p_0 + \frac{e}{c}A_0$ and $\mathbf p + \frac{e}{c}\mathbf A$, $A_0$ and $\mathbf A$ being the scalar and vector
potentials of the field at the place where the electron is. This gives
us the equation

    $$ \Bigl\{p_0 + \frac{e}{c}A_0 - \rho_1\Bigl(\boldsymbol\sigma, \mathbf p + \frac{e}{c}\mathbf A\Bigr) - \rho_3mc\Bigr\}\psi = 0, $$    (11)

which is the fundamental wave equation of the relativistic theory of
the electron.

The four components of $\psi$ in (10) or (11) should be pictured as written
one below another, so as to form a single-column matrix. The square
matrices $\rho$ and $\sigma$ then get multiplied into the single-column matrix $\psi$
according to matrix multiplication, the product being in each case
another single-column matrix. The conjugate imaginary wave
function that represents a bra should be pictured as having its four
components written one beside another, so as to form a single-row matrix,
which can be multiplied from the right by a square matrix $\rho$ or $\sigma$ to give
another single-row matrix. We denote this conjugate imaginary wave
function pictured as a single-row matrix by $\bar\psi^\dagger$, using the symbol $\dagger$ to
denote the transpose of any matrix, i.e. the result of interchanging
the rows and columns. Then the conjugate imaginary of equation (11)
reads

    $$ \bar\psi^\dagger\Bigl\{p_0 + \frac{e}{c}A_0 - \rho_1\Bigl(\boldsymbol\sigma, \mathbf p + \frac{e}{c}\mathbf A\Bigr) - \rho_3mc\Bigr\} = 0, $$    (12)

in which the operators $p$ operate to the left. An operator of
differentiation operating to the left must be interpreted according to (24) of § 22.


68.  Invariance  under  a Lorentz  transformation 

Before proceeding to discuss the physical consequences of the wave
equation (11) or (12), we shall first verify that our theory really is
invariant under a Lorentz transformation, or, stated more accurately,
that the physical results the theory leads to are independent of the
Lorentz frame of reference used. This is not by any means obvious
from the form of the wave equation (11). We have to verify that, if
we write down the wave equation in a different Lorentz frame, the
solutions of the new wave equation may be put into one-one
correspondence with those of the original one in such a way that
corresponding solutions may be assumed to represent the same state. For
either Lorentz frame, the square of the modulus of the wave function,
summed over the four components, should give the probability per
unit volume of the electron being at a certain place in that Lorentz
frame. We may call this the probability density. Its values, calculated
in different Lorentz frames for wave functions representing the same
state, should be connected like the time components in these frames
of some 4-vector. Further, the 4-dimensional divergence of this
4-vector should vanish, signifying conservation of the electron, or that
the electron cannot appear or disappear in any volume without passing
through the boundary.

For brevity it is convenient to introduce the symbol $\alpha_0 = 1$ and to
suppose that the suffixes of the four $\alpha_\mu$ ($\mu = 0, 1, 2, 3$) can be raised
in accordance with the rules (1), even though these four $\alpha$'s do not
form the components of a 4-vector. We can now write the wave
equation (11)

    $$ \Bigl\{\alpha^\mu\Bigl(p_\mu + \frac{e}{c}A_\mu\Bigr) - \alpha_m mc\Bigr\}\psi = 0. $$    (13)

The four $\alpha^\mu$ satisfy

    $$ \alpha^\mu\alpha_m\alpha^\nu + \alpha^\nu\alpha_m\alpha^\mu = 2g^{\mu\nu}\alpha_m, $$    (14)

with $g^{\mu\nu}$ defined by (2), as one can verify by taking separately the cases
when $\mu$ and $\nu$ are both 0, when one of them is 0, and when neither of
them is 0.

Let us apply an infinitesimal Lorentz transformation and distinguish
quantities referring to the new frame of reference by a star. The
components of the 4-vector $p_\mu$ will transform according to equations of
the type

    $$ p_\mu^* = p_\mu + a_\mu{}^\nu p_\nu, $$    (15)

where the $a_\mu{}^\nu$ are small numbers of the first order. We shall neglect
quantities that are quadratic in the $a$'s and thus of the second order.
The condition for a Lorentz transformation is that

    $$ p_\mu^*p^{\mu*} = p_\mu p^\mu, $$

which gives

    $$ a_\mu{}^\nu p_\nu p^\mu + p_\mu a^{\mu\nu}p_\nu = 0, $$

leading to

    $$ a^{\mu\nu} + a^{\nu\mu} = 0. $$    (16)

The components of $A_\mu$ will transform according to the same law, so
we have

    $$ p_\mu + \frac{e}{c}A_\mu = p_\mu^* + \frac{e}{c}A_\mu^* - a_\mu{}^\nu\Bigl(p_\nu^* + \frac{e}{c}A_\nu^*\Bigr). $$

Thus the wave equation (13) becomes

    $$ \Bigl\{(\alpha^\mu - \alpha^\nu a_\nu{}^\mu)\Bigl(p_\mu^* + \frac{e}{c}A_\mu^*\Bigr) - \alpha_m mc\Bigr\}\psi = 0. $$    (17)


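Define

    $$ M = \tfrac14 a_{\rho\sigma}\,\alpha^\rho\alpha_m\alpha^\sigma. $$    (18)

Then from (14)

    $$ \alpha^\mu\alpha_m M - M\alpha_m\alpha^\mu = \tfrac14 a_{\rho\sigma}\{(\alpha^\mu\alpha_m\alpha^\rho + \alpha^\rho\alpha_m\alpha^\mu)\alpha_m\alpha^\sigma - \alpha^\rho\alpha_m(\alpha^\mu\alpha_m\alpha^\sigma + \alpha^\sigma\alpha_m\alpha^\mu)\} $$

    $$ = -a_\rho{}^\mu\alpha^\rho $$

with the help of (16), and hence

    $$ \alpha^\mu(1 + \alpha_m M) = (1 + M\alpha_m)(\alpha^\mu - \alpha^\nu a_\nu{}^\mu). $$    (19)

Thus, multiplying (17) by $(1 + M\alpha_m)$ on the left, we get

    $$ \Bigl\{\alpha^\mu(1 + \alpha_m M)\Bigl(p_\mu^* + \frac{e}{c}A_\mu^*\Bigr) - (\alpha_m + M)mc\Bigr\}\psi = 0. $$
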


So if we put

    $$ (1 + \alpha_m M)\psi = \psi^*, $$    (20)

we get

    $$ \Bigl\{\alpha^\mu\Bigl(p_\mu^* + \frac{e}{c}A_\mu^*\Bigr) - \alpha_m mc\Bigr\}\psi^* = 0. $$    (21)

This is of the same form as (13) with the starred variables $p_\mu^*$, $A_\mu^*$, $\psi^*$,
and shows that (13) is invariant under an infinitesimal Lorentz
transformation, provided $\psi$ is subjected to the right transformation, given
by (20). A finite Lorentz transformation can be built up from
infinitesimal ones, so under a finite Lorentz transformation the wave equation
(13) is also invariant. Note that the matrices $\alpha^\mu$ do not get altered at all.
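
The algebra behind this invariance can be checked numerically. The sketch below assumes the matrix representation of § 67 (built here as Kronecker products), picks arbitrary sample values for the antisymmetric coefficients $a_{\rho\sigma}$, and tests relation (14), the commutation identity $\alpha^\mu\alpha_m M - M\alpha_m\alpha^\mu = -a_\rho{}^\mu\alpha^\rho$, and the antihermitian character of $M$. All variable names and numerical values are choices made for this illustration.

```python
# Numerical check of (14) and of the identity that leads to (19).

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def kron(a, b):
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def dagger(a):
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def is_close(a, b, tol=1e-12):
    return all(abs(x - y) <= tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def triple(x, y, z):
    return matmul(matmul(x, y), z)

I2 = [[1, 0], [0, 1]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
sigma = [kron(I2, s) for s in PAULI]
rho = [kron(s, I2) for s in PAULI]
I4 = kron(I2, I2)

G = [1, -1, -1, -1]                      # diagonal elements of g^{mu nu}
alpha_m = rho[2]
# Raised alphas: alpha^0 = 1, alpha^r = -alpha_r = -rho_1 sigma_r.
alpha_up = [I4] + [scale(-1, matmul(rho[0], s)) for s in sigma]

relation_14 = all(
    is_close(add(triple(alpha_up[mu], alpha_m, alpha_up[nu]),
                 triple(alpha_up[nu], alpha_m, alpha_up[mu])),
             scale(2 * G[mu] if mu == nu else 0, alpha_m))
    for mu in range(4) for nu in range(4))

# Arbitrary small antisymmetric coefficients a_{rho sigma} (sample values).
upper = {(0, 1): 1e-3, (0, 2): -2e-3, (0, 3): 5e-4,
         (1, 2): 3e-3, (1, 3): -1e-3, (2, 3): 4e-3}
a = [[0.0] * 4 for _ in range(4)]
for (r, s), value in upper.items():
    a[r][s], a[s][r] = value, -value

# Equation (18): M = (1/4) a_{rho sigma} alpha^rho alpha_m alpha^sigma.
M = scale(0, I4)
for r in range(4):
    for s in range(4):
        M = add(M, scale(0.25 * a[r][s], triple(alpha_up[r], alpha_m, alpha_up[s])))

def identity_for(mu):
    lhs = add(triple(alpha_up[mu], alpha_m, M),
              scale(-1, triple(M, alpha_m, alpha_up[mu])))
    rhs = scale(0, I4)
    for r in range(4):
        # a_rho^mu = a_{rho sigma} g^{sigma mu} = a[r][mu] * G[mu]
        rhs = add(rhs, scale(-a[r][mu] * G[mu], alpha_up[r]))
    return is_close(lhs, rhs)

identity_holds = all(identity_for(mu) for mu in range(4))
m_is_antihermitian = is_close(dagger(M), scale(-1, M))
print(relation_14, identity_holds, m_is_antihermitian)
```

The commutation identity is exactly linear in the $a$'s, so it holds to rounding error for any antisymmetric choice; only the step from it to (19) neglects second-order terms.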

The invariance proved above means that the solutions $\psi$ of the
original wave equation (13) are in one-one correspondence with the
solutions $\psi^*$ of the new wave equation (21), corresponding solutions
being connected by (20). We assume that corresponding solutions
represent the same physical state. We must now verify that the
physical interpretations of corresponding solutions, referred to their
respective Lorentz frames of reference, are in agreement. This requires
that $\bar\psi^\dagger\psi$ should give the probability density referred to the original
frame and $\bar\psi^{*\dagger}\psi^*$ the probability density referred to the new frame.

Let us examine the relationship between these quantities. $\bar\psi^\dagger\psi$ is the
same as $\bar\psi^\dagger\alpha^0\psi$ and forms one of the four quantities $\bar\psi^\dagger\alpha^\mu\psi$, which should
be treated together.

Equations (18) and (16) show that $M$ is pure imaginary. Thus the
conjugate imaginary of equation (20) is

    $$ \bar\psi^{*\dagger} = \bar\psi^\dagger(1 - M\alpha_m). $$

Hence

    $$ \bar\psi^{*\dagger}\alpha^\mu\psi^* = \bar\psi^\dagger(1 - M\alpha_m)\alpha^\mu(1 + \alpha_m M)\psi $$

    $$ = \bar\psi^\dagger(1 - M\alpha_m)(1 + M\alpha_m)(\alpha^\mu - \alpha^\nu a_\nu{}^\mu)\psi $$

from (19). This reduces to

    $$ \bar\psi^{*\dagger}\alpha^\mu\psi^* = \bar\psi^\dagger(\alpha^\mu - \alpha^\nu a_\nu{}^\mu)\psi $$

    $$ = \bar\psi^\dagger\alpha^\mu\psi + a^\mu{}_\nu\,\bar\psi^\dagger\alpha^\nu\psi $$

with the help of (16). If we lower the suffix $\mu$ here, we get an equation
of the same form as (15), which shows that the four quantities $\bar\psi^\dagger\alpha_\mu\psi$
transform like the contravariant components of a 4-vector. Thus $\bar\psi^\dagger\psi$
transforms like the time component of a 4-vector, which is the correct
transformation law for a probability density. The space components
of the 4-vector, namely $\bar\psi^\dagger\alpha_r\psi$, if multiplied by $c$, give the probability
current, or the probability of the electron crossing unit area per
unit time.
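
It should be noted that $\bar\psi^\dagger\alpha_m\psi$ is invariant, since

    $$ \bar\psi^{*\dagger}\alpha_m\psi^* = \bar\psi^\dagger(1 - M\alpha_m)\alpha_m(1 + \alpha_m M)\psi = \bar\psi^\dagger\alpha_m\psi. $$

We must verify finally the conservation law, that the divergence

    $$ \frac{\partial}{\partial x^\mu}\bigl(\bar\psi^\dagger\alpha^\mu\psi\bigr) $$    (22)

vanishes. To prove this, multiply equation (13) by $\bar\psi^\dagger$ on the left.
The result is

    $$ \bar\psi^\dagger\alpha^\mu\Bigl(i\hbar\frac{\partial}{\partial x^\mu} + \frac{e}{c}A_\mu\Bigr)\psi - \bar\psi^\dagger\alpha_m mc\,\psi = 0. $$

The conjugate imaginary equation is

    $$ \Bigl(-i\hbar\frac{\partial\bar\psi^\dagger}{\partial x^\mu} + \frac{e}{c}A_\mu\bar\psi^\dagger\Bigr)\alpha^\mu\psi - \bar\psi^\dagger\alpha_m mc\,\psi = 0. $$

Subtracting and dividing by $i\hbar$, we get

    $$ \bar\psi^\dagger\alpha^\mu\frac{\partial\psi}{\partial x^\mu} + \frac{\partial\bar\psi^\dagger}{\partial x^\mu}\alpha^\mu\psi = 0, $$

which just expresses the vanishing of (22). In this way we complete
the proof that our theory gives consistent results in whichever frame
of reference it is applied.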


69.  The  motion  of  a free  electron 

It is of interest to consider the motion of a free electron in the
Heisenberg picture according to the above theory and to study the
Heisenberg equations of motion. These equations of motion can be
integrated exactly, as was first done by Schrödinger.‡ For brevity
we shall omit the suffix $t$ which the notation of § 28 requires to be
inserted in dynamical variables that vary with time in the
Heisenberg picture.

As Hamiltonian we must take the expression which we get as equal
to $cp_0$ when we put the operator on $\psi$ in (10) equal to zero, i.e.

    $$ H = c\rho_1(\boldsymbol\sigma, \mathbf p) + \rho_3mc^2 = c(\boldsymbol\alpha, \mathbf p) + \rho_3mc^2. $$    (23)

We see at once that the momentum commutes with $H$ and is thus a
constant of the motion. Further, the $x_1$-component of the velocity is

    $$ \dot x_1 = [x_1, H] = c\alpha_1. $$    (24)

This result is rather surprising, as it means an altogether different
relation between velocity and momentum from what one has in
classical mechanics. It is connected, however, with the expression
$\bar\psi^\dagger c\alpha_1\psi$ for a component of the probability current. The $\dot x_1$ given by (24)
has as eigenvalues $\pm c$, corresponding to the eigenvalues $\pm 1$ of $\alpha_1$.
As $\dot x_2$ and $\dot x_3$ are similar, we can conclude that a measurement of a
component of the velocity of a free electron is certain to lead to the result $\pm c$.
This conclusion is easily seen to hold also when there is a field present.

‡ Schrödinger, Sitzungsb. d. Berlin. Akad., 1930, p. 418.

Since electrons are observed in practice to have velocities
considerably less than that of light, it would seem that we have here a
contradiction with experiment. The contradiction is not real, though,
since the theoretical velocity in the above conclusion is the velocity
at one instant of time while observed velocities are always average
velocities through appreciable time intervals. We shall find upon
further examination of the equations of motion that the velocity is
not at all constant, but oscillates rapidly about a mean value which
agrees with the observed value.

It may easily be verified that a measurement of a component of the
velocity must lead to the result $\pm c$ in a relativistic theory, simply
from an elementary application of the principle of uncertainty of
§ 24. To measure the velocity we must measure the position at two
slightly different times and then divide the change of position by the
time interval. (It will not do to measure the momentum and apply
a formula, as the ordinary connexion between velocity and
momentum is not valid.) In order that our measured velocity may
approximate to the instantaneous velocity, the time interval between the
two measurements of position must be very short and hence these
measurements must be very accurate. The great accuracy with
which the position of the electron is known during the time-interval
must give rise, according to the principle of uncertainty, to an almost
complete indeterminacy in its momentum. This means that almost
all values of the momentum are equally probable, so that the
momentum is almost certain to be infinite. An infinite value for a component
of momentum corresponds to the value $\pm c$ for the corresponding
component of velocity.

Let us now examine how the velocity of the electron varies with
time. We have

    $$ i\hbar\dot\alpha_1 = \alpha_1H - H\alpha_1. $$

Now since $\alpha_1$ anticommutes with all the terms in $H$ except $c\alpha_1p_1$,

    $$ \alpha_1H + H\alpha_1 = \alpha_1c\alpha_1p_1 + c\alpha_1p_1\alpha_1 = 2cp_1, $$

and hence

    $$ i\hbar\dot\alpha_1 = 2\alpha_1H - 2cp_1 = -2H\alpha_1 + 2cp_1. $$    (25)

Since $H$ and $p_1$ are constants, it follows from the first of equations
(25) that

    $$ i\hbar\ddot\alpha_1 = 2\dot\alpha_1H. $$    (26)

This differential equation in $\dot\alpha_1$ can be integrated immediately, the
result being

    $$ \dot\alpha_1 = \dot\alpha_1^0\,e^{-2iHt/\hbar}, $$    (27)

where $\dot\alpha_1^0$ is a constant, equal to the value of $\dot\alpha_1$ when $t = 0$. The
factor $e^{-2iHt/\hbar}$ must be put to the right of the factor $\dot\alpha_1^0$ in (27) on
account of the $H$ occurring to the right of the $\dot\alpha_1$ in (26). The second
of equations (25) leads in the same way to the result

    $$ \dot\alpha_1 = e^{2iHt/\hbar}\,\dot\alpha_1^0. $$
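
The two algebraic facts used here, that $\alpha_1H + H\alpha_1 = 2cp_1$ and that $H$ has only the eigenvalues $\pm(c^2\mathbf p^2 + m^2c^4)^{1/2}$, can be checked for a fixed momentum. The following sketch treats the momentum components as ordinary numbers (legitimate because $\mathbf p$ commutes with $H$), works in assumed units with $c = m = 1$, and uses sample momentum values chosen only for illustration.

```python
# Check alpha_1 H + H alpha_1 = 2 c p_1 and H^2 = c^2 p^2 + m^2 c^4 for fixed p.

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def kron(a, b):
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def is_close(a, b, tol=1e-12):
    return all(abs(x - y) <= tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

I2 = [[1, 0], [0, 1]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
sigma = [kron(I2, s) for s in PAULI]
rho = [kron(s, I2) for s in PAULI]
I4 = kron(I2, I2)
alpha = [matmul(rho[0], s) for s in sigma]   # alpha_1, alpha_2, alpha_3

c, m = 1.0, 1.0                # assumed units
p = (0.3, -0.4, 1.2)           # sample momentum eigenvalues (p_1, p_2, p_3)

# Hamiltonian (23): H = c (alpha, p) + rho_3 m c^2.
H = scale(m * c ** 2, rho[2])
for k in range(3):
    H = add(H, scale(c * p[k], alpha[k]))

anticommutator = add(matmul(alpha[0], H), matmul(H, alpha[0]))
anticommutator_ok = is_close(anticommutator, scale(2 * c * p[0], I4))

energy_squared = c ** 2 * sum(x * x for x in p) + m ** 2 * c ** 4
h_squared_ok = is_close(matmul(H, H), scale(energy_squared, I4))

print("alpha_1 H + H alpha_1 = 2 c p_1:", anticommutator_ok)
print("H^2 = c^2 p^2 + m^2 c^4:", h_squared_ok)
```

Because $H^2$ is a multiple of the unit matrix, $H^{-1}$ exists and equals $H$ divided by that multiple, which is what allows the factors $H^{-1}$ to appear in the integrated equations.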


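We can now easily complete the integration of the equation of motion
for $x_1$. From (27) and the first of equations (25)

    $$ \alpha_1 = \tfrac12 i\hbar\,\dot\alpha_1^0\,e^{-2iHt/\hbar}H^{-1} + cp_1H^{-1}, $$    (28)

and hence the time-integral of equation (24) is

    $$ x_1 = -\tfrac14 c\hbar^2\,\dot\alpha_1^0\,e^{-2iHt/\hbar}H^{-2} + c^2p_1H^{-1}t + a_1, $$    (29)

$a_1$ being a constant.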

From (28) we see that the $x_1$-component of velocity, $c\alpha_1$, consists
of two parts, a constant part $c^2p_1H^{-1}$, connected with the momentum
by the classical relativistic formula, and an oscillatory part

    $$ \tfrac12 ic\hbar\,\dot\alpha_1^0\,e^{-2iHt/\hbar}H^{-1}, $$

whose frequency is high, being $2H/h$, which is at least $2mc^2/h$. Only
the constant part would be observed in a practical measurement of
velocity, such a measurement giving the average velocity through a
time-interval much larger than $h/2mc^2$. The oscillatory part secures
that the instantaneous value of $\dot x_1$ shall have the eigenvalues $\pm c$. The
oscillatory part of $x_1$ is small, being, according to (29),

    $$ -\tfrac14 c\hbar^2\,\dot\alpha_1^0\,e^{-2iHt/\hbar}H^{-2} = \tfrac12 ic\hbar(\alpha_1 - cp_1H^{-1})H^{-1}, $$

which is of the order of magnitude $\hbar/mc$, since $(\alpha_1 - cp_1H^{-1})$ is of the
order of magnitude unity.
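
To give a feeling for the magnitudes involved, the lower bound $2mc^2/h$ on the frequency, the averaging time $h/2mc^2$ and the length $\hbar/mc$ can be evaluated for the electron. The short calculation below is only an illustration; the SI values of the constants are inserted here as assumptions and do not come from the text.

```python
import math

# Physical constants in SI units (values assumed for this illustration).
PLANCK_H = 6.62607015e-34          # J s
SPEED_OF_LIGHT = 299792458.0       # m / s
ELECTRON_MASS = 9.1093837015e-31   # kg

hbar = PLANCK_H / (2.0 * math.pi)
rest_energy = ELECTRON_MASS * SPEED_OF_LIGHT ** 2

min_frequency = 2.0 * rest_energy / PLANCK_H            # 2 m c^2 / h, in Hz
averaging_time = PLANCK_H / (2.0 * rest_energy)         # h / 2 m c^2, in s
amplitude_scale = hbar / (ELECTRON_MASS * SPEED_OF_LIGHT)  # hbar / m c, in m

print(f"minimum oscillation frequency: {min_frequency:.3e} Hz")
print(f"corresponding period:          {averaging_time:.3e} s")
print(f"order of oscillation size:     {amplitude_scale:.3e} m")
```

The frequency comes out of order $10^{20}$ per second and the length of order $10^{-13}$ metre, which indicates why only the constant part of the velocity is seen in a practical measurement.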


70.  Existence  of  the  spin 

In § 67 we saw that the correct wave equation for the electron in
the absence of an electromagnetic field, namely equation (7) or (10), is
equivalent to the wave equation (6) which is suggested from analogy
with the classical theory. This equivalence no longer holds when
there is a field. The wave equation to be expected from analogy with
the classical theory in this case is

    $$ \Bigl\{\Bigl(p_0 + \frac{e}{c}A_0\Bigr)^2 - \Bigl(\mathbf p + \frac{e}{c}\mathbf A\Bigr)^2 - m^2c^2\Bigr\}\psi = 0, $$    (30)

in which the operator is just the classical relativistic Hamiltonian. If
we multiply (11) by some factor on the left to make it resemble
(30) as closely as possible, namely the factor

    $$ p_0 + \frac{e}{c}A_0 + \rho_1\Bigl(\boldsymbol\sigma, \mathbf p + \frac{e}{c}\mathbf A\Bigr) + \rho_3mc, $$

we get

    $$ \Bigl\{\Bigl(p_0 + \frac{e}{c}A_0\Bigr)^2 - \Bigl(\boldsymbol\sigma, \mathbf p + \frac{e}{c}\mathbf A\Bigr)^2 - m^2c^2 - \rho_1\Bigl[\Bigl(p_0 + \frac{e}{c}A_0\Bigr)\Bigl(\boldsymbol\sigma, \mathbf p + \frac{e}{c}\mathbf A\Bigr) - \Bigl(\boldsymbol\sigma, \mathbf p + \frac{e}{c}\mathbf A\Bigr)\Bigl(p_0 + \frac{e}{c}A_0\Bigr)\Bigr]\Bigr\}\psi = 0. $$    (31)

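We now use the general formula that, if $\mathbf B$ and $\mathbf C$ are any two
three-dimensional vectors that commute with $\boldsymbol\sigma$,

    $$ (\boldsymbol\sigma, \mathbf B)(\boldsymbol\sigma, \mathbf C) = \sum_{123}\{\sigma_1^2B_1C_1 + \sigma_1\sigma_2B_1C_2 + \sigma_2\sigma_1B_2C_1\}, $$

the summation referring to cyclic permutations of the suffixes 1, 2, 3,
or

    $$ (\boldsymbol\sigma, \mathbf B)(\boldsymbol\sigma, \mathbf C) = (\mathbf B, \mathbf C) + i\sum_{123}\sigma_3(B_1C_2 - B_2C_1) $$

    $$ = (\mathbf B, \mathbf C) + i(\boldsymbol\sigma, \mathbf B \times \mathbf C). $$    (32)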
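
Formula (32) can be tried out for the simplest case in which the components of $\mathbf B$ and $\mathbf C$ are ordinary numbers. The sketch below uses the two-rowed $\sigma$ matrices of § 37 and arbitrary sample vectors; it is an illustration only, and it does not cover the case of interest in the text where the components of $\mathbf B$ and $\mathbf C$ fail to commute with each other.

```python
# Check (sigma, B)(sigma, C) = (B, C) + i (sigma, B x C) for numerical vectors.

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def is_close(a, b, tol=1e-12):
    return all(abs(x - y) <= tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

I2 = [[1, 0], [0, 1]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

def sigma_dot(v):
    """The matrix (sigma, v) = sigma_1 v_1 + sigma_2 v_2 + sigma_3 v_3."""
    total = scale(0, I2)
    for k in range(3):
        total = add(total, scale(v[k], SIGMA[k]))
    return total

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

B = (1.0, -2.0, 0.5)     # sample vectors, chosen arbitrarily
C = (0.3, 4.0, -1.0)

lhs = matmul(sigma_dot(B), sigma_dot(C))
rhs = add(scale(dot(B, C), I2), scale(1j, sigma_dot(cross(B, C))))
formula_32_holds = is_close(lhs, rhs)
print("formula (32) holds for the sample vectors:", formula_32_holds)
```

For numerical vectors with $\mathbf B = \mathbf C$ the cross product vanishes; the extra term in (33) arises precisely because the components of $\mathbf p + \frac{e}{c}\mathbf A$ do not commute.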

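Taking $\mathbf B = \mathbf C = \mathbf p + \frac{e}{c}\mathbf A$, we find, since

    $$ \Bigl(\mathbf p + \frac{e}{c}\mathbf A\Bigr) \times \Bigl(\mathbf p + \frac{e}{c}\mathbf A\Bigr) = \frac{e}{c}\{\mathbf p \times \mathbf A + \mathbf A \times \mathbf p\} $$

    $$ = -i\hbar\frac{e}{c}\,\mathrm{curl}\,\mathbf A = -i\hbar\frac{e}{c}\boldsymbol{\mathcal H}, $$

where $\boldsymbol{\mathcal H}$ is the magnetic field, that

    $$ \Bigl(\boldsymbol\sigma, \mathbf p + \frac{e}{c}\mathbf A\Bigr)^2 = \Bigl(\mathbf p + \frac{e}{c}\mathbf A\Bigr)^2 + \frac{\hbar e}{c}(\boldsymbol\sigma, \boldsymbol{\mathcal H}). $$    (33)

Also we have

    $$ \Bigl(p_0 + \frac{e}{c}A_0\Bigr)\Bigl(\boldsymbol\sigma, \mathbf p + \frac{e}{c}\mathbf A\Bigr) - \Bigl(\boldsymbol\sigma, \mathbf p + \frac{e}{c}\mathbf A\Bigr)\Bigl(p_0 + \frac{e}{c}A_0\Bigr) $$

    $$ = \frac{e}{c}(\boldsymbol\sigma, p_0\mathbf A - \mathbf Ap_0 + A_0\mathbf p - \mathbf pA_0) $$

    $$ = i\hbar\frac{e}{c}\Bigl(\boldsymbol\sigma, \frac{1}{c}\frac{\partial\mathbf A}{\partial t} + \mathrm{grad}\,A_0\Bigr) = -i\hbar\frac{e}{c}(\boldsymbol\sigma, \boldsymbol{\mathcal E}), $$

where $\boldsymbol{\mathcal E}$ is the electric field. Thus (31) becomes

    $$ \Bigl\{\Bigl(p_0 + \frac{e}{c}A_0\Bigr)^2 - \Bigl(\mathbf p + \frac{e}{c}\mathbf A\Bigr)^2 - m^2c^2 - \frac{\hbar e}{c}(\boldsymbol\sigma, \boldsymbol{\mathcal H}) + i\rho_1\frac{\hbar e}{c}(\boldsymbol\sigma, \boldsymbol{\mathcal E})\Bigr\}\psi = 0. $$    (34)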

This  equation  differs  from  (30)  through  having  two  extra  terms  in 
the  operator.  These  extra  terms  involve  some  new  physical  effects, 
but  since  they  are  not  real  they  do  not  lend  themselves  very  directly 
to  physical  interpretation. 

To get an understanding of the physical features involved in the
difference between (34) and (30) it is better to work with the
Heisenberg picture, this picture being always the more suitable one for
comparisons between classical and quantum mechanics. The
Heisenberg equations of motion are determined by the Hamiltonian

    $$ H = -eA_0 + c\rho_1\Bigl(\boldsymbol\sigma, \mathbf p + \frac{e}{c}\mathbf A\Bigr) + \rho_3mc^2, $$    (35)

the generalization of (23) to the case when there is a field. Equation
(35) gives

    $$ \Bigl(\frac{H}{c} + \frac{e}{c}A_0\Bigr)^2 = \Bigl(\boldsymbol\sigma, \mathbf p + \frac{e}{c}\mathbf A\Bigr)^2 + m^2c^2 $$

    $$ = \Bigl(\mathbf p + \frac{e}{c}\mathbf A\Bigr)^2 + m^2c^2 + \frac{\hbar e}{c}(\boldsymbol\sigma, \boldsymbol{\mathcal H}) $$    (36)

with the help of (33). We have here the real part of the extra terms
in (34) appearing without the pure imaginary part. For an electron
moving slowly (i.e. with small momentum), we may expect the
Heisenberg equations of motion to be determined by a Hamiltonian
of the form $mc^2 + H_1$, where $H_1$ is small compared with $mc^2$. Putting
$mc^2 + H_1$ for $H$ in (36) and neglecting $H_1^2$ and other terms involving
$c^{-2}$, we get, on dividing by $2m$,

    $$ H_1 + eA_0 = \frac{1}{2m}\Bigl(\mathbf p + \frac{e}{c}\mathbf A\Bigr)^2 + \frac{\hbar e}{2mc}(\boldsymbol\sigma, \boldsymbol{\mathcal H}). $$    (37)

The Hamiltonian $H_1$ given by (37) is the same as the classical
Hamiltonian for a slow electron, except for the last term

    $$ \frac{\hbar e}{2mc}(\boldsymbol\sigma, \boldsymbol{\mathcal H}). $$

This term may be considered as an additional potential energy
which a slow electron has in the quantum theory and may be
interpreted as arising from the electron having a magnetic moment
$-\frac{e\hbar}{2mc}\boldsymbol\sigma$. This magnetic moment is the one assumed in §§ 41 and
47 for dealing with the Zeeman effect and is in agreement with
experiment.

The spin angular momentum does not give rise to any potential
energy and therefore does not appear in the result of the preceding
calculation. The simplest way of showing the existence of the spin
angular momentum is to take the case of the motion of a free electron
or an electron in a central field of force and determine the angular
momentum integrals. This means working with the Hamiltonian (23),
or with the Hamiltonian (35) with $\mathbf A = 0$ and $A_0$ a function of the
radius $r$, i.e.

    $$ H = -eA_0(r) + c\rho_1(\boldsymbol\sigma, \mathbf p) + \rho_3mc^2, $$    (38)

and obtaining the Heisenberg equations of motion for the angular
momentum. With either Hamiltonian we find for the rate of change
of the $x_1$-component of orbital angular momentum, $m_1 = x_2p_3 - x_3p_2$,
with the help of commutation relations proved in § 35,

    $$ i\hbar\dot m_1 = m_1H - Hm_1 $$

    $$ = c\rho_1\{m_1(\boldsymbol\sigma, \mathbf p) - (\boldsymbol\sigma, \mathbf p)m_1\} $$

    $$ = c\rho_1(\boldsymbol\sigma, m_1\mathbf p - \mathbf pm_1) $$

    $$ = i\hbar c\rho_1\{\sigma_2p_3 - \sigma_3p_2\}. $$

Thus $\dot m_1 \neq 0$ and the orbital angular momentum is not a constant
of the motion. This result is to be expected from the integrated
equation of motion (29), the oscillatory part of the motion here
displayed giving rise to an oscillatory term in the angular momentum.
We have further

    $$ i\hbar\dot\sigma_1 = \sigma_1H - H\sigma_1 $$

    $$ = c\rho_1\{\sigma_1(\boldsymbol\sigma, \mathbf p) - (\boldsymbol\sigma, \mathbf p)\sigma_1\} $$

    $$ = c\rho_1(\sigma_1\boldsymbol\sigma - \boldsymbol\sigma\sigma_1, \mathbf p) $$

    $$ = 2ic\rho_1\{\sigma_3p_2 - \sigma_2p_3\} $$

with the help of equations (51) of § 37. Hence

    $$ \dot m_1 + \tfrac12\hbar\dot\sigma_1 = 0, $$

so that the vector $\mathbf m + \tfrac12\hbar\boldsymbol\sigma$ is a constant of the motion. This result
one can interpret by saying the electron has a spin angular momentum
$\tfrac12\hbar\boldsymbol\sigma$, which must be added to the orbital angular momentum $\mathbf m$ before
one gets a constant of the motion. The spin angular momentum
could alternatively be obtained from the rotation operators for states
of spin in accordance with the general method of § 35.
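
The matrix part of this argument lends itself to a direct check. In the sketch below the momentum components are treated as ordinary numbers, units with $\hbar = c = m = 1$ are assumed, and the rate of change of $m_1$ is taken over from the result $\dot m_1 = c\rho_1(\sigma_2p_3 - \sigma_3p_2)$ stated in the text rather than computed, since that step involves the position operators. The sample momentum is arbitrary.

```python
# Check i*hbar*sigma_1_dot = sigma_1 H - H sigma_1 = 2 i c rho_1 (sigma_3 p_2 - sigma_2 p_3)
# and that m_1_dot + (hbar/2) sigma_1_dot vanishes.

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def kron(a, b):
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def is_close(a, b, tol=1e-12):
    return all(abs(x - y) <= tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

I2 = [[1, 0], [0, 1]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
sigma = [kron(I2, s) for s in PAULI]
rho = [kron(s, I2) for s in PAULI]
ZERO = scale(0, kron(I2, I2))

hbar, c, m = 1.0, 1.0, 1.0       # assumed units
p = (0.3, -0.4, 1.2)             # sample momentum eigenvalues (p_1, p_2, p_3)

# Free Hamiltonian (23): H = c rho_1 (sigma, p) + rho_3 m c^2.
H = scale(m * c ** 2, rho[2])
for k in range(3):
    H = add(H, scale(c * p[k], matmul(rho[0], sigma[k])))

commutator = add(matmul(sigma[0], H), scale(-1, matmul(H, sigma[0])))
expected = scale(2j * c, matmul(rho[0], add(scale(p[1], sigma[2]),
                                            scale(-p[2], sigma[1]))))
commutator_ok = is_close(commutator, expected)

sigma1_dot = scale(1.0 / (1j * hbar), commutator)
# Orbital result quoted from the text: m_1_dot = c rho_1 (sigma_2 p_3 - sigma_3 p_2).
m1_dot = scale(c, matmul(rho[0], add(scale(p[2], sigma[1]),
                                     scale(-p[1], sigma[2]))))
total_is_constant = is_close(add(m1_dot, scale(0.5 * hbar, sigma1_dot)), ZERO)

print("commutator matches the text:", commutator_ok)
print("m_1 + (hbar/2) sigma_1 is conserved:", total_is_constant)
```

Neither $\dot m_1$ nor $\dot\sigma_1$ vanishes for this momentum, yet their combination does, in line with the interpretation of $\tfrac12\hbar\boldsymbol\sigma$ as a spin angular momentum.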

The same vector $\boldsymbol{\sigma}$ fixes the directions of both the spin magnetic
moment and the spin angular momentum. If an electron in a certain
state of spin has a spin angular momentum of $\tfrac12\hbar$ in a particular
direction, it will have a magnetic moment $-e\hbar/2mc$ in the same
direction.

We were led to the value $\tfrac12\hbar$ for the spin of the electron by an
argument  depending  simply  on  general  principles  of  quantum  theory 
and  relativity.  One  could  apply  the  same  argument  to  other  kinds 
of  elementary  particle  and  one  would  be  led  to  the  same  conclusion, 
that  the  spin  angular  momentum  is  half  a quantum.  This  would  be 
satisfactory  for  the  proton  and  the  neutron,  but  there  are  some  kinds 
of  elementary  particle  (e.g.  the  photon  and  certain  kinds  of  meson) 
whose spins are known experimentally to be different from $\tfrac12\hbar$, so we
have  a discrepancy  between  our  theory  and  experiment. 

The  answer  is  to  be  found  in  a hidden  assumption  in  our  work. 
Our  argument  is  valid  only  provided  the  position  of  the  particle  is 
an  observable.  If  this  assumption  holds,  the  particle  must  have  a 
spin  angular  momentum  of  half  a quantum.  For  those  particles  that 
have  a different  spin  the  assumption  must  be  false  and  any  dynamical 
variables $x_1$, $x_2$, $x_3$ that may be introduced to describe the position
of the particle cannot be observables in accordance with our general
theory. For such particles there is no true Schrödinger representation.
One might be able to introduce a quasi wave function involving the
dynamical variables $x_1$, $x_2$, $x_3$, but it would not have the correct
physical interpretation of a wave function, namely that the square of its
modulus gives the probability density. For such particles there is still
a momentum  representation,  which  is  sufficient  for  practical  purposes. 

71.  Transition  to  polar  variables 

For  the  further  study  of  the  motion  of  an  electron  in  a central  field 
of  force  with  the  Hamiltonian  (38),  it  is  convenient  to  make  a 
transformation  to  polar  coordinates,  as  was  done  in  § 38  in  the 
non-relativistic case. We can introduce $r$ and $p_r$ as before, but
instead of $k$, the magnitude of the orbital angular momentum $\mathbf{m}$,
which is no longer a constant of the motion, we must now use the
magnitude of the total angular momentum $\mathbf{M} = \mathbf{m} + \tfrac12\hbar\boldsymbol{\sigma}$. Let us put

$$j^2\hbar^2 = M_1^2 + M_2^2 + M_3^2 + \tfrac14\hbar^2. \tag{39}$$

The eigenvalues of $m_3$ are integral multiples of $\hbar$, those of $\tfrac12\hbar\sigma_3$ are
$\pm\tfrac12\hbar$, and hence those of $M_3$ must be half odd integral multiples of
$\hbar$. It follows from the theory of § 36 that the eigenvalues of $|j|$ must
be integers greater than zero.

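If in formula (32) we take $\mathbf{B} = \mathbf{C} = \mathbf{m}$, we get

$$\begin{aligned}
(\boldsymbol{\sigma}, \mathbf{m})^2 &= \mathbf{m}^2 + i(\boldsymbol{\sigma},\, \mathbf{m} \times \mathbf{m}) \\
 &= \mathbf{m}^2 - \hbar(\boldsymbol{\sigma}, \mathbf{m}) \\
 &= (\mathbf{m} + \tfrac12\hbar\boldsymbol{\sigma})^2 - 2\hbar(\boldsymbol{\sigma}, \mathbf{m}) - \tfrac34\hbar^2.
\end{aligned}$$

Hence

$$\{(\boldsymbol{\sigma}, \mathbf{m}) + \hbar\}^2 = \mathbf{M}^2 + \tfrac14\hbar^2.$$

Thus $(\boldsymbol{\sigma}, \mathbf{m}) + \hbar$ is a quantity whose square is $\mathbf{M}^2 + \tfrac14\hbar^2$ and we could,
consistently with equation (39), define $j\hbar$ as $(\boldsymbol{\sigma}, \mathbf{m}) + \hbar$. This would
not be the most convenient definition for $j$, however, since we would
like to have $j$ a constant of the motion and $(\boldsymbol{\sigma}, \mathbf{m}) + \hbar$ is not constant.
We have, in fact, from applications of (32),

$$(\boldsymbol{\sigma}, \mathbf{m})(\boldsymbol{\sigma}, \mathbf{p}) = i(\boldsymbol{\sigma},\, \mathbf{m} \times \mathbf{p})$$

and

$$(\boldsymbol{\sigma}, \mathbf{p})(\boldsymbol{\sigma}, \mathbf{m}) = i(\boldsymbol{\sigma},\, \mathbf{p} \times \mathbf{m}),$$

so that

$$\begin{aligned}
(\boldsymbol{\sigma}, \mathbf{m})(\boldsymbol{\sigma}, \mathbf{p}) + (\boldsymbol{\sigma}, \mathbf{p})(\boldsymbol{\sigma}, \mathbf{m}) &= i\sum_{123}\sigma_1\{m_2p_3 - m_3p_2 + p_2m_3 - p_3m_2\} \\
 &= i\sum_{123}\sigma_1 \,.\, 2i\hbar p_1 = -2\hbar(\boldsymbol{\sigma}, \mathbf{p}),
\end{aligned}$$

or

$$\{(\boldsymbol{\sigma}, \mathbf{m}) + \hbar\}(\boldsymbol{\sigma}, \mathbf{p}) + (\boldsymbol{\sigma}, \mathbf{p})\{(\boldsymbol{\sigma}, \mathbf{m}) + \hbar\} = 0.$$

Thus $(\boldsymbol{\sigma}, \mathbf{m}) + \hbar$ anticommutes with one of the terms in the expression
(38) for $H$, namely the term $c\rho_1(\boldsymbol{\sigma}, \mathbf{p})$, and commutes with the other
two. It follows that $\rho_3\{(\boldsymbol{\sigma}, \mathbf{m}) + \hbar\}$ commutes with all the three terms
in $H$ and is a constant of the motion. But the square of $\rho_3\{(\boldsymbol{\sigma}, \mathbf{m}) + \hbar\}$
is also $\mathbf{M}^2 + \tfrac14\hbar^2$. We can therefore take

$$j\hbar = \rho_3\{(\boldsymbol{\sigma}, \mathbf{m}) + \hbar\}, \tag{40}$$

which gives us a convenient rational definition for $j$ which is consistent
with (39) and makes $j$ a constant of the motion. The eigenvalues
of this $j$ are all positive and negative integers, excluding zero.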
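The relation $\{(\boldsymbol{\sigma}, \mathbf{m}) + \hbar\}^2 = \mathbf{M}^2 + \tfrac14\hbar^2$ and the integer eigenvalues can be illustrated with finite matrices. The following is a minimal sketch, assuming units with $\hbar = 1$, a fixed orbital quantum number $l$, and the standard angular-momentum matrices; the factor $\rho_3$, which only supplies an extra sign $\pm 1$, is left out, and all function names are hypothetical.

```python
import numpy as np

PAULI = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]


def orbital_matrices(l):
    """Matrices of m_1, m_2, m_3 (units of hbar) for orbital quantum number l."""
    dim = 2 * l + 1
    mvals = np.arange(l, -l - 1, -1)          # l, l-1, ..., -l
    m3 = np.diag(mvals).astype(complex)
    raising = np.zeros((dim, dim), dtype=complex)
    for col in range(1, dim):
        mv = mvals[col]
        # <l, mv+1| m_+ |l, mv> = sqrt(l(l+1) - mv(mv+1))
        raising[col - 1, col] = np.sqrt(l * (l + 1) - mv * (mv + 1))
    m1 = (raising + raising.conj().T) / 2
    m2 = (raising - raising.conj().T) / 2j
    return m1, m2, m3


def sigma_m_plus_one(l):
    """Matrix of (sigma, m) + hbar, in units of hbar, on orbital (x) spin space."""
    dim = 2 * l + 1
    m = orbital_matrices(l)
    sigma_dot_m = sum(np.kron(mk, sk) for mk, sk in zip(m, PAULI))
    return sigma_dot_m + np.eye(2 * dim)


def total_M_squared(l):
    """Matrix of M^2 = (m + sigma/2)^2 in units of hbar^2."""
    dim = 2 * l + 1
    m = orbital_matrices(l)
    M = [np.kron(mk, np.eye(2)) + 0.5 * np.kron(np.eye(dim), sk)
         for mk, sk in zip(m, PAULI)]
    return sum(Mk @ Mk for Mk in M)


K = sigma_m_plus_one(1)
print(np.allclose(K @ K, total_M_squared(1) + 0.25 * np.eye(6)))
print(np.round(np.linalg.eigvalsh(K)).astype(int))
```

For $l = 1$ the eigenvalues of $(\boldsymbol{\sigma}, \mathbf{m}) + \hbar$ are $2\hbar$ (four times) and $-\hbar$ (twice), so none of them is zero; multiplying by $\rho_3 = \pm 1$ then gives both signs of each integer.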

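By a further application of (32), we get

$$\begin{aligned}
(\boldsymbol{\sigma}, \mathbf{x})(\boldsymbol{\sigma}, \mathbf{p}) &= (\mathbf{x}, \mathbf{p}) + i(\boldsymbol{\sigma}, \mathbf{m}) \\
 &= rp_r + i\rho_3j\hbar - i\hbar,
\end{aligned} \tag{41}$$

with the help of (40) and also of equation (58) of § 38. We introduce
the linear operator $\epsilon$ defined by

$$r\epsilon = \rho_1(\boldsymbol{\sigma}, \mathbf{x}). \tag{42}$$

Since $r$ commutes with $\rho_1$ and with $(\boldsymbol{\sigma}, \mathbf{x})$, it must commute with $\epsilon$.
We thus have

$$r^2\epsilon^2 = [\rho_1(\boldsymbol{\sigma}, \mathbf{x})]^2 = (\boldsymbol{\sigma}, \mathbf{x})^2 = \mathbf{x}^2 = r^2,$$

or $\epsilon^2 = 1$.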

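Now $\rho_1(\boldsymbol{\sigma}, \mathbf{p})$ commutes with $j$, and since there is symmetry between
$\mathbf{x}$ and $\mathbf{p}$ so far as angular momentum is concerned, $\rho_1(\boldsymbol{\sigma}, \mathbf{x})$ must also
commute with $j$. Hence $\epsilon$ commutes with $j$. Further, $\epsilon$ must commute
with $p_r$, since we have

$$(\boldsymbol{\sigma}, \mathbf{x})(\mathbf{x}, \mathbf{p}) - (\mathbf{x}, \mathbf{p})(\boldsymbol{\sigma}, \mathbf{x}) = (\boldsymbol{\sigma},\, \mathbf{x}(\mathbf{x}, \mathbf{p}) - (\mathbf{x}, \mathbf{p})\mathbf{x}) = i\hbar(\boldsymbol{\sigma}, \mathbf{x}),$$

which gives

$$r\epsilon\, rp_r - rp_r\, r\epsilon = i\hbar r\epsilon,$$

or

$$r^2\epsilon p_r - r^2p_r\epsilon = 0.$$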

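From (41) and (42) we obtain

$$r\epsilon\rho_1(\boldsymbol{\sigma}, \mathbf{p}) = rp_r + i\rho_3j\hbar - i\hbar,$$

or

$$\rho_1(\boldsymbol{\sigma}, \mathbf{p}) = \epsilon(p_r - i\hbar/r) + i\epsilon\rho_3j\hbar/r.$$

Thus (38) becomes

$$H/c = -(e/c)A_0 + \epsilon(p_r - i\hbar/r) + i\epsilon\rho_3j\hbar/r + \rho_3mc.$$

This gives our Hamiltonian expressed in terms of polar variables. It
should be noticed that $\epsilon$ and $\rho_3$ commute with all the other variables
occurring in $H$ and anticommute with one another. This means that
we can take a representation with $\rho_3$ diagonal in which $\epsilon$ and $\rho_3$ are
represented respectively by the matrices

$$\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{43}$$

If $r$ is also diagonal in the representation, the representative
$\langle r'\rho_3'|\rangle$ of a ket will have two components, $\langle r', 1|\rangle = \psi_a(r')$ and
$\langle r', -1|\rangle = \psi_b(r')$ say, referring to the two rows and columns of the
matrices (43).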
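The algebraic properties required of $\epsilon$ and $\rho_3$ (each squares to unity and the two anticommute) can be confirmed for a concrete pair of 2x2 matrices. The sketch below assumes the matrices (43) are the pair $\begin{pmatrix}0&-i\\i&0\end{pmatrix}$, $\begin{pmatrix}1&0\\0&-1\end{pmatrix}$; the constant and function names are hypothetical.

```python
import numpy as np

# Assumed form of the matrices (43), with rho_3 diagonal
EPSILON = np.array([[0, -1j], [1j, 0]])
RHO3 = np.array([[1, 0], [0, -1]], dtype=complex)


def anticommutator(a, b):
    """Return ab + ba."""
    return a @ b + b @ a


def radial_matrix_terms():
    """Matrices multiplying (p_r - i*hbar/r) and j*hbar/r in the polar form of H/c."""
    return EPSILON, 1j * EPSILON @ RHO3


print(np.allclose(EPSILON @ EPSILON, np.eye(2)))       # epsilon^2 = 1
print(np.allclose(anticommutator(EPSILON, RHO3), 0))   # epsilon rho_3 + rho_3 epsilon = 0
print(radial_matrix_terms()[1].real)                   # i*epsilon*rho_3 is a real matrix
```

With this choice $i\epsilon\rho_3$ is real and off-diagonal, which is why the term in $j\hbar/r$ couples $\psi_a$ to $\psi_b$ with real coefficients.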

72.  The  fine-structure  of  the  energy-levels  of  hydrogen 

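We shall now take the case of the hydrogen atom, for which $A_0 = e/r$,
and work out its energy-levels, given by the eigenvalues $H'$ of $H$.
The equation $(H' - H)|\rangle = 0$ which defines these eigenvalues, when
written in terms of representatives in the representation discussed
above with $\epsilon$ and $\rho_3$ represented by the matrices (43), gives the
equations

$$\begin{aligned}
\left(\frac{H'}{c} + \frac{e^2}{cr}\right)\psi_a + \hbar\left(\frac{\partial}{\partial r} + \frac{1}{r}\right)\psi_b + \frac{j\hbar}{r}\psi_b - mc\,\psi_a &= 0, \\
\left(\frac{H'}{c} + \frac{e^2}{cr}\right)\psi_b - \hbar\left(\frac{\partial}{\partial r} + \frac{1}{r}\right)\psi_a + \frac{j\hbar}{r}\psi_a + mc\,\psi_b &= 0.
\end{aligned}$$

If we put

$$\frac{\hbar}{mc - H'/c} = a_1, \qquad \frac{\hbar}{mc + H'/c} = a_2, \tag{44}$$

these equations reduce to

$$\begin{aligned}
\left(\frac{1}{a_1} - \frac{\alpha}{r}\right)\psi_a - \left(\frac{\partial}{\partial r} + \frac{j+1}{r}\right)\psi_b &= 0, \\
\left(\frac{1}{a_2} + \frac{\alpha}{r}\right)\psi_b - \left(\frac{\partial}{\partial r} - \frac{j-1}{r}\right)\psi_a &= 0,
\end{aligned} \tag{45}$$

where $\alpha = e^2/\hbar c$, which is a small number. We shall solve these
equations by a similar method to that used for equation (73) in § 39.
Put

$$\psi_a = r^{-1}e^{-r/a}f, \qquad \psi_b = r^{-1}e^{-r/a}g, \tag{46}$$

introducing two new functions, $f$ and $g$, of $r$, where

$$a = (a_1a_2)^{1/2} = \hbar(m^2c^2 - H'^2/c^2)^{-1/2}. \tag{47}$$

Equations (45) become

$$\begin{aligned}
\left(\frac{1}{a_1} - \frac{\alpha}{r}\right)f - \left(\frac{\partial}{\partial r} - \frac{1}{a} + \frac{j}{r}\right)g &= 0, \\
\left(\frac{1}{a_2} + \frac{\alpha}{r}\right)g - \left(\frac{\partial}{\partial r} - \frac{1}{a} - \frac{j}{r}\right)f &= 0.
\end{aligned} \tag{48}$$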

We now try for a solution in which $f$ and $g$ are in the form of power
series,

$$f = \sum_s c_sr^s, \qquad g = \sum_s c'_sr^s, \tag{49}$$

in which consecutive values of $s$ differ by unity though these values
need not be integers. Substituting these expressions for $f$ and $g$ in
(48) and picking out coefficients of $r^{s-1}$, we obtain

$$\begin{aligned}
c_{s-1}/a_1 - \alpha c_s - (s+j)c'_s + c'_{s-1}/a &= 0, \\
c'_{s-1}/a_2 + \alpha c'_s - (s-j)c_s + c_{s-1}/a &= 0.
\end{aligned} \tag{50}$$

By multiplying the first of these equations by $a$ and the second by
$a_2$ and subtracting, we eliminate both $c_{s-1}$ and $c'_{s-1}$, since from
(47) $a/a_1 = a_2/a$. We are left with

$$[a\alpha - a_2(s-j)]c_s + [a_2\alpha + a(s+j)]c'_s = 0, \tag{51}$$

a relation which shows the connexion between the primed and
unprimed $c$'s.

The boundary condition at $r = 0$ requires that $r\psi_a$ and $r\psi_b \to 0$ as
$r \to 0$, so from (46) $f$ and $g \to 0$ as $r \to 0$. Thus the series (49) must
terminate on the side of small $s$. If $s_0$ is the minimum value of $s$ for
which $c_s$ and $c'_s$ do not both vanish, we obtain from (50), by putting
$s = s_0$ and $c_{s_0-1} = c'_{s_0-1} = 0$,

$$\begin{aligned}
\alpha c_{s_0} + (s_0+j)c'_{s_0} &= 0, \\
\alpha c'_{s_0} - (s_0-j)c_{s_0} &= 0,
\end{aligned} \tag{52}$$

which give

$$\alpha^2 = -s_0^2 + j^2.$$

Since the boundary condition requires that the minimum value of $s$
shall be greater than zero, we must take

$$s_0 = +\sqrt{(j^2 - \alpha^2)}.$$

To investigate the convergence of the series (49) we shall determine
the ratio $c_s/c_{s-1}$ for large $s$. Equation (51) and the second of equations
(50) give approximately, when $s$ is large,

$$a_2c_s = ac'_s$$

and

$$sc_s = c_{s-1}/a + c'_{s-1}/a_2.$$

Hence

$$c_s/c_{s-1} = 2/as.$$

The series (49) will therefore converge like

$$\sum_s \frac{1}{s!}\left(\frac{2r}{a}\right)^s,$$

or $e^{2r/a}$. This result is similar to that obtained in § 39 and allows us
to infer, as in § 39, that all values of $H'$ are permissible for which $a$
is pure imaginary, i.e. from (47), for which $H' > mc^2$, while for
$H' < mc^2$ we take $a$ to be positive and then find that only those
values of $H'$ are permissible for which the series (49) terminate on
the side of large $s$.

If the series (49) terminate with the terms $c_s$ and $c'_s$, so that
$c_{s+1} = c'_{s+1} = 0$, we obtain from (50) with $s+1$ substituted for $s$

$$\begin{aligned}
c_s/a_1 + c'_s/a &= 0, \\
c'_s/a_2 + c_s/a &= 0.
\end{aligned} \tag{53}$$

These two equations are equivalent on account of (47). When
combined with (51), they give

$$a_1[a\alpha - a_2(s-j)] = a[a_2\alpha + a(s+j)],$$

which reduces to

$$2a_1a_2s = a(a_1 - a_2)\alpha$$

with the help of (44). Squaring and using (47), we obtain

$$s^2(m^2c^2 - H'^2/c^2) = \alpha^2H'^2/c^2.$$

Hence

$$\frac{H'}{mc^2} = \left(1 + \frac{\alpha^2}{s^2}\right)^{-1/2}.$$

The $s$ here, which specifies the last term in the series, must be greater
than $s_0$ by some integer not less than zero. Calling this integer $n$,
we have

$$s = n + \sqrt{(j^2 - \alpha^2)}$$

and thus

$$\frac{H'}{mc^2} = \left\{1 + \frac{\alpha^2}{\{n + \sqrt{(j^2 - \alpha^2)}\}^2}\right\}^{-1/2}. \tag{54}$$

This formula gives the discrete energy-levels of the hydrogen
spectrum and was first obtained by Sommerfeld working with Bohr's
orbit theory. There are two quantum numbers $n$ and $j$ involved, but
owing to $\alpha^2$ being very small the energy depends almost entirely on
$n + |j|$. Values of $n$ and $|j|$ that give the same $n + |j|$ give rise to a
set of energy-levels lying very close to one another, and to the
energy-level given by the non-relativistic formula (80) of § 39 with
$s = n + |j|$, apart from the constant term $mc^2$.
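The near-degeneracy of levels with the same $n + |j|$ can be seen by evaluating formula (54) numerically. The following is a minimal sketch, assuming a numerical value for $\alpha$ (the CODATA figure quoted in the code is an input supplied here, not part of the text) and using the leading expansion $1 - \alpha^2/2(n+|j|)^2$ as a stand-in for the non-relativistic level; the function names are hypothetical.

```python
import math

ALPHA = 7.2973525693e-3  # assumed numerical value of e^2/(hbar c)


def energy_ratio(n, j, alpha=ALPHA):
    """H'/(m c^2) from formula (54), for integer n >= 0 and non-zero integer j."""
    if n < 0 or j == 0:
        raise ValueError("need n >= 0 and j a non-zero integer")
    s = n + math.sqrt(j * j - alpha * alpha)
    return 1.0 / math.sqrt(1.0 + (alpha / s) ** 2)


def leading_order_ratio(total, alpha=ALPHA):
    """Leading approximation 1 - alpha^2 / (2 N^2), with N = n + |j|."""
    return 1.0 - alpha ** 2 / (2 * total ** 2)


# The levels with n + |j| = 2 lie very close together
close_pair = (energy_ratio(1, 1), energy_ratio(0, 2))
print(close_pair[1] - close_pair[0])               # splitting of order alpha^4
print(close_pair[0] - leading_order_ratio(2))      # small correction to the leading term
```

The formula depends on $j$ only through $j^2$, so it does not by itself distinguish the sign of $j$; which signs are admissible for a given $n$ has to be settled separately.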

We used equations (53) by combining them with (51), but this does
not make full use of (53) since the coefficients of $c_s$ and $c'_s$ in (51) may
both vanish. In this case we get, multiplying the first coefficient by
$a_1$ and the second by $a$ and adding,

$$a(a_1 + a_2)\alpha + 2a_1a_2j = 0.$$

Thus $j$ must be negative in this case. With the help of (44) and (47)
we get further

$$-\frac{2j}{\alpha} = \frac{a}{a_2} + \frac{a}{a_1} = \frac{2mca}{\hbar} = \frac{2mc}{(m^2c^2 - H'^2/c^2)^{1/2}},$$

or

$$\frac{H'^2}{m^2c^4} = 1 - \frac{\alpha^2}{j^2}.$$

Since $H'$ must be positive, this leads to

$$\frac{H'}{mc^2} = \frac{\sqrt{(j^2 - \alpha^2)}}{|j|}, \tag{55}$$

which is the value of $H'$ given by (54) when $n = 0$. The case $n = 0$
with $j$ negative thus needs further investigation to see whether the
conditions (53) are then fulfilled.

With $n = 0$, the maximum value of $s$ is the same as the minimum,
so equations (53) with $s_0$ substituted for $s$ should agree with (52).
Now (55) gives, from (44) and (47),

$$\frac{1}{a_1} = \frac{mc}{\hbar}\left\{1 - \frac{\sqrt{(j^2 - \alpha^2)}}{|j|}\right\}, \qquad \frac{1}{a} = \frac{mc}{\hbar}\frac{\alpha}{|j|},$$

so the first of equations (53) with $s_0$ substituted for $s$ gives

$$c_{s_0}\{|j| - \sqrt{(j^2 - \alpha^2)}\} + c'_{s_0}\alpha = 0.$$

This agrees with the second of equations (52) only if $j$ is positive.
We can conclude that, for $n = 0$, $j$ must be a positive integer, while
for the other values of $n$ all non-zero integral values of $j$ are allowed.
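The comparison of (52) with (53) for $n = 0$ amounts to comparing two expressions for the ratio $c'_{s_0}/c_{s_0}$. A small sketch of that comparison, assuming an illustrative numerical value of $\alpha$ and hypothetical function names:

```python
import math


def s_min(j, alpha):
    """Minimum exponent s_0 = +sqrt(j^2 - alpha^2)."""
    return math.sqrt(j * j - alpha * alpha)


def ratio_from_52(j, alpha):
    """c'_{s0}/c_{s0} from the second of equations (52)."""
    return (s_min(j, alpha) - j) / alpha


def ratio_from_53(j, alpha):
    """c'_{s0}/c_{s0} from the first of equations (53), with H' taken from (55)."""
    return (s_min(j, alpha) - abs(j)) / alpha


def n_zero_allowed(j, alpha=1 / 137.036):
    """True when the two ratios agree, i.e. when an n = 0 solution exists."""
    return math.isclose(ratio_from_52(j, alpha), ratio_from_53(j, alpha))


print(n_zero_allowed(1), n_zero_allowed(-1))
```

The two ratios coincide exactly when $|j| = j$, which is the statement that $j$ must be positive for $n = 0$.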

73.  Theory  of  the  positron 

It has been mentioned in § 67 that the wave equation for the
electron admits of twice as many solutions as it ought to, half of them
referring to states with negative values for the kinetic energy $cp_0 + eA_0$.
This difficulty was introduced as soon as we passed from equation (5)
to equation (6) and is inherent in any relativistic theory. It occurs
also in classical relativistic theory, but is not then serious since, owing
to the continuity in the variation of all classical dynamical variables,
if the kinetic energy $cp_0 + eA_0$ is initially positive (when it must be
greater than or equal to $mc^2$), it cannot subsequently be negative
(when it would have to be less than or equal to $-mc^2$). In the
quantum  theory,  however,  discontinuous  transitions  may  take  place, 
so  that  if  the  electron  is  initially  in  a state  of  positive  kinetic  energy 
it  may  make  a transition  to  a state  of  negative  kinetic  energy.  It  is 
therefore  no  longer  permissible  simply  to  ignore  the  negative-energy 
states,  as  one  can  do  in  the  classical  theory. 

Let us examine the negative-energy solutions of the equation

$$\left\{\left(p_0 + \frac{e}{c}A_0\right) - \alpha_1\left(p_1 + \frac{e}{c}A_1\right) - \alpha_2\left(p_2 + \frac{e}{c}A_2\right) - \alpha_3\left(p_3 + \frac{e}{c}A_3\right) - \alpha_mmc\right\}\psi = 0 \tag{56}$$

a little more closely. For this purpose it is convenient to use a
representation of the $\alpha$'s in which all the elements of the matrices
representing $\alpha_1$, $\alpha_2$, and $\alpha_3$ are real and all those of the matrix representing
$\alpha_m$ are pure imaginary or zero. Such a representation may be obtained,
for instance, from that of § 67 by interchanging the expressions for $\alpha_2$
and $\alpha_m$ in (9). If equation (56) is expressed as a matrix equation in
this representation and we put $-i$ for $i$ all through it, we get,
remembering the $i$ in (4),

$$\left\{\left(-p_0 + \frac{e}{c}A_0\right) - \alpha_1\left(-p_1 + \frac{e}{c}A_1\right) - \alpha_2\left(-p_2 + \frac{e}{c}A_2\right) - \alpha_3\left(-p_3 + \frac{e}{c}A_3\right) + \alpha_mmc\right\}\bar{\psi} = 0. \tag{57}$$

Thus each solution $\psi$ of the wave equation (56) has for its conjugate
complex $\bar{\psi}$ a solution of the wave equation (57). Further, if the solution
$\psi$ of (56) belongs to a negative value for $cp_0 + eA_0$, the corresponding
solution $\bar{\psi}$ of (57) will belong to a positive value for $cp_0 - eA_0$. But the
operator in (57) is just what one would get if one substituted $-e$ for $e$
in the operator in (56). It follows that each negative-energy solution
of (56) is the conjugate complex of a positive-energy solution of the
wave equation obtained from (56) by substitution of $-e$ for $e$, which
solution represents an electron of charge $+e$ (instead of $-e$, as we
had up to the present) moving through the given electromagnetic field.
Thus the unwanted solutions of (56) are connected with the motion
of an electron with a charge $+e$. (It is not possible, of course, with
an arbitrary electromagnetic field, to separate the solutions of (56)
definitely into those referring to positive and those referring to negative
values for $cp_0 + eA_0$, as such a separation would imply that transitions
from one kind to the other do not occur. The preceding discussion is
therefore only a rough one, applying to the case when such a separation
is approximately possible.)

In  this  way  we  are  led  to  infer  that  the  negative-energy  solutions 
of  (56)  refer  to  the  motion  of  a new  kind  of  particle  having  the  mass 
of  an  electron  and  the  opposite  charge.  Such  particles  have  been 
observed experimentally and are called positrons. We cannot,
however, simply assert that the negative-energy solutions represent
positrons, as this would make the dynamical relations all wrong. For
instance, it is certainly not true that a positron has a negative kinetic
energy. We must therefore establish the theory of the positrons on
a somewhat different footing. We assume that nearly all the
negative-energy states are occupied, with one electron in each state in accordance
with the exclusion principle of Pauli. An unoccupied negative-energy
state will now appear as something with a positive energy, since to
make it disappear, i.e. to fill it up, we should have to add to it an
electron with negative energy. We assume that these unoccupied
negative-energy states are the positrons.

These  assumptions  require  there  to  be  a distribution  of  electrons 
of  infinite  density  everywhere  in  the  world.  A perfect  vacuum  is  a 
region  where  all  the  states  of  positive  energy  are  unoccupied  and  all 
those of negative energy are occupied. In a perfect vacuum Maxwell's
equation

$$\operatorname{div}\boldsymbol{\mathcal{E}} = 0$$

must, of course, be valid. This means that the infinite distribution
of negative-energy electrons does not contribute to the electric field.
Only departures from the distribution in a vacuum will contribute
to the electric density $j_0$ in Maxwell's equation

$$\operatorname{div}\boldsymbol{\mathcal{E}} = 4\pi j_0. \tag{58}$$

Thus there will be a contribution $-e$ for each occupied state of
positive energy and a contribution $+e$ for each unoccupied state of
negative energy.

The exclusion principle will operate to prevent a positive-energy
electron  ordinarily  from  making  transitions  to  states  of  negative 
energy.  It  will  still  be  possible,  however,  for  such  an  electron  to 
drop  into  an  unoccupied  state  of  negative  energy.  In  this  case  we 
should  have  an  electron  and  positron  disappearing  simultaneously, 
their  energy  being  emitted  in  the  form  of  radiation.  The  converse 
process  would  consist  in  the  creation  of  an  electron  and  a positron 
from  electromagnetic  radiation. 

From  the  symmetry  between  occupied  and  unoccupied  fermion 
states  discussed  at  the  end  of  § 65,  the  present  theory  is  essentially 
symmetrical  between  the  electrons  and  the  positrons.  We  should 
have  an  equivalent  theory  if  we  supposed  the  positrons  to  be  the 
basic particles, described by wave equations of the form (11) with $-e$
for  e,  and  then  supposed  that  nearly  all  the  states  of  negative  energy 
for  the  positrons  are  filled  up,  a hole  in  the  distribution  of  negative- 
energy  positrons  being  then  interpreted  as  an  ordinary  electron.  The 
theory  could  be  developed  consistently  with  the  hypothesis  that  all 
the  laws  of  physics  are  symmetrical  between  positive  and  negative 
electric charge.

74. The electromagnetic field in the absence of matter

The theory of radiation that was set up in Chapter X involved some
approximations  in  its  handling  of  the  interaction  of  the  radiation 
with  matter.  The  object  of  the  present  chapter  is  to  remove  these 
approximations  and  get,  as  far  as  possible,  an  accurate  theory  of  the 
electromagnetic  field  interacting  with  matter,  subject  to  the  limitation 
that  the  matter  consists  only  of  electrons  and  positrons.  Too  little  is 
known  about  other  forms  of  matter,  protons,  neutrons,  etc.,  for  one 
to  attempt  at  the  present  time  to  get  an  accurate  theory  of  their 
interaction  with  the  electromagnetic  field.  But  there  exists  a precise 
theory  of  electrons  and  positrons,  as  given  in  the  preceding  chapter, 
which  one  can  use  for  building  up  a precise  theory  of  the  interaction 
of  the  electromagnetic  field  with  this  form  of  matter.  The  theory 
must  bring  in  the  interaction  of  the  electrons  and  positrons  with  one 
another, through their Coulomb forces, as well as their interaction
with  electromagnetic  radiation,  and  it  must,  of  course,  conform  to 
special  relativity.  For  brevity  in  this  chapter  we  shall  take  c = 1. 

We  must  first  consider  the  electromagnetic  field  without  interaction 
with  matter.  Now  in  § 63  we  set  up  first  a treatment  of  the  field  of 
radiation  without  interaction  of  matter.  Dynamical  variables  were 
there  introduced  to  describe  the  field,  commutation  relations  were 
established  for  them,  and  a Hamiltonian  was  found  which  made  them 
vary  correctly  with  the  time.  No  approximations  were  made  in  this 
piece  of  work.  The  resulting  theory  would  therefore  be  a satisfactory, 
exact theory of radiation without interaction with matter, were it not
for  one  feature  in  it,  namely  our  taking  the  scalar  potential  to  be  zero. 
This  feature  spoils  the  relativistic  form  of  the  theory  and  makes  it 
unsuitable  as  a starting-point  from  which  to  develop  a precise  theory 
of  the  electromagnetic  field  in  interaction  with  matter. 

We  must  therefore  extend  the  treatment  of  § 63  by  leaving  A0 
general  and  bringing  it  into  the  work  along  with  the  other  potentials 
$A_1$, $A_2$, $A_3$. Thus we shall have the four $A_\mu$ and they will satisfy, as
the generalization of (62) of § 63,

$$\Box A_\mu = 0, \qquad \partial A_\mu/\partial x_\mu = 0. \tag{1), (2}$$

For  the  present  we  shall  ignore  the  second  of  these  equations  and 
work  only  from  the  first. 


Equation (1) shows that each $A_\mu$ can be resolved into waves
travelling with the velocity of light. Thus, corresponding to equation
(63) of § 63,

$$A_\mu(x) = \int (A^c_{\mu k}e^{ik.x} + \bar{A}^c_{\mu k}e^{-ik.x})\,d^3k, \tag{3}$$

where $k.x$ denotes the four-dimensional scalar product

$$k.x = k_0x_0 - (\mathbf{k}, \mathbf{x}),$$

$k_\mu$ being the 4-vector whose space components are the same as the
components of the three-dimensional vector $\mathbf{k}$ of § 63 and whose time
component $k_0 = |\mathbf{k}|$, and $d^3k$ denotes $dk_1\,dk_2\,dk_3$, as in § 63. The index
$c$ in the coefficients $A^c_{\mu k}$ indicates that they are constant in time. We
shall later introduce some other Fourier coefficients $A_{\mu k}$, not constant
in time, which must be distinguished from the present ones.

The Fourier component $A^c_{\mu k}$ has a part $A^c_{0k}$ coming from $A_0(x)$ and
a part $A^c_{rk}$ ($r = 1, 2, 3$) which is a three-dimensional vector. The latter
can be decomposed into two parts, a longitudinal part lying in the
direction of $\mathbf{k}$, the direction of motion of the waves, and a transverse
part perpendicular to $\mathbf{k}$. The longitudinal part is $k_rk_s/k_0^2\,.\,A^c_{sk}$. The
transverse part is

$$(\delta_{rs} - k_rk_s/k_0^2)A^c_{sk} = \mathcal{A}^c_{rk}, \tag{4}$$

say. It satisfies

$$k_r\mathcal{A}^c_{rk} = 0. \tag{5}$$

It is known from the Maxwell theory of light that only the
transverse part is effective for giving electromagnetic radiation. Chapter X
dealt only with this transverse part, the $A_{rk}$ of § 63 being the same as
the present transverse part, and equation (65) of § 63 corresponding to the present
equation (5). Nevertheless, the longitudinal part cannot be neglected
in a complete theory of electrodynamics because of its connexion
with the Coulomb forces, as will show up later.
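The splitting of a Fourier amplitude into longitudinal and transverse parts is a projection, which can be sketched in a few lines. This is only an illustration under stated assumptions: the three spatial components of one amplitude are stored in an array `A`, the wave vector in an array `k`, and the function name is hypothetical.

```python
import numpy as np


def split_longitudinal_transverse(A, k):
    """Split the 3-vector A into parts along and perpendicular to k.

    longitudinal_r = k_r k_s A_s / k_0^2 with k_0 = |k|;
    transverse_r   = (delta_rs - k_r k_s / k_0^2) A_s, as in equation (4).
    """
    k = np.asarray(k, dtype=float)
    A = np.asarray(A, dtype=complex)
    k0_squared = k @ k
    if k0_squared == 0.0:
        raise ValueError("k must be non-zero")
    longitudinal = k * (k @ A) / k0_squared
    transverse = A - longitudinal
    return longitudinal, transverse


A = np.array([1.0 + 2.0j, -0.5j, 3.0])
k = np.array([0.0, 0.0, 2.0])
longitudinal, transverse = split_longitudinal_transverse(A, k)
print(np.isclose(k @ transverse, 0.0))   # the transversality condition (5)
```

For $\mathbf{k}$ along the third axis the transverse part simply keeps components 1 and 2, the two polarization directions.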

We can now decompose the three-dimensional vector $A_r(x)$ into
two parts, a transverse part and a longitudinal part. The former is

$$\mathcal{A}_r(x) = \int (\mathcal{A}^c_{rk}e^{ik.x} + \bar{\mathcal{A}}^c_{rk}e^{-ik.x})\,d^3k$$

and satisfies

$$\partial\mathcal{A}_r(x)/\partial x_r = 0. \tag{6}$$

The longitudinal part may be expressed as the gradient $\partial V/\partial x_r$ of a
scalar $V$ given by

$$V = i\int k_s/k_0^2\,.\,(A^c_{sk}e^{ik.x} - \bar{A}^c_{sk}e^{-ik.x})\,d^3k. \tag{7}$$

Thus

$$A_r = \mathcal{A}_r + \partial V/\partial x_r. \tag{8}$$

The magnetic field is determined by the transverse part of $A_r$,

$$\boldsymbol{\mathcal{H}} = \operatorname{curl}\mathbf{A} = \operatorname{curl}\boldsymbol{\mathcal{A}}.$$

It is convenient to count $A_0(x)$ as longitudinal, so that the complete
potentials $A_\mu(x)$ are separated into a transverse part $\mathcal{A}_r(x)$ and a
longitudinal part $A_0$, $\partial V/\partial x_r$. This separation, of course, refers to a
particular  Lorentz  frame  of  reference  and  must  not  be  used  when  one 
wants  to  keep  one’s  equations  in  a relativistic  form. 

Each Fourier coefficient $A^c_{\mu k}$ occurs in (3) combined with the time
factor $e^{ik_0x_0}$. The product

$$A^c_{\mu k}e^{ik_0x_0} = A_{\mu k} \tag{9}$$

say, forms a Hamiltonian dynamical variable in classical mechanics
and a Heisenberg dynamical variable in quantum mechanics, like
the $A_{kl}$ of § 63.

The work of § 63 gives us the P.B. relations for the transverse part
of $A_{\mu k}$. To connect up with it, we pass over to discrete $\mathbf{k}$-values in
three-dimensional $\mathbf{k}$-space and take, for example, a particular discrete
$\mathbf{k}$-value for which $k_1 = k_2 = 0$, $k_3 = k_0 > 0$. Then the polarization
variable $l$ can take on two values referring to the two directions 1
and 2 and equation (73) of § 63 gives, with the help of the commutation
relations for the $\eta$'s and $\bar{\eta}$'s, equations (11) of § 60,

$$[A_{1k}, \bar{A}_{1k}] = [A_{2k}, \bar{A}_{2k}] = is_k/4\pi^2k_0. \tag{10}$$

The work of § 63 gives us no information about $A_{3k}$ and $A_{0k}$.

However, we can now obtain the P.B. relations for $A_{3k}$ and $A_{0k}$
from the theory of relativity. Equations (10) have to be built up into
a relativistic set and the only simple way of doing so is by adding to
them the two further equations

$$[A_{3k}, \bar{A}_{3k}] = -[A_{0k}, \bar{A}_{0k}] = is_k/4\pi^2k_0, \tag{11}$$

so that the four equations (10) and (11), together with the conditions
that $A_{\mu k}$ and $\bar{A}_{\nu k}$ commute for $\mu \neq \nu$ (as they must do since they
refer to different degrees of freedom), combine to form the single
tensor equation

$$[A_{\mu k}, \bar{A}_{\nu k}] = -ig_{\mu\nu}s_k/4\pi^2k_0. \tag{12}$$

We get in this way the P.B. relations for all the dynamical variables.
Equation (12) can be extended to

$$[A_{\mu k}, \bar{A}_{\nu k'}] = -ig_{\mu\nu}s_k\delta_{kk'}/4\pi^2k_0. \tag{13}$$

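Let us now return to continuous $\mathbf{k}$-values. To convert $\delta_{kk'}$ to
continuous $\mathbf{k}$-values we note that, for a general function $f(\mathbf{k})$ in
three-dimensional $\mathbf{k}$-space,

$$\sum_k f(\mathbf{k})\delta_{kk'} = f(\mathbf{k}') = \int f(\mathbf{k})\delta(\mathbf{k} - \mathbf{k}')\,d^3k, \tag{14}$$

where $\delta(\mathbf{k} - \mathbf{k}')$ is the three-dimensional $\delta$ function

$$\delta(\mathbf{k} - \mathbf{k}') = \delta(k_1 - k_1')\,\delta(k_2 - k_2')\,\delta(k_3 - k_3').$$

In order that (14) may conform to the standard formula connecting
sums and integrals, equation (52) of § 62, we must have

$$s_k\delta_{kk'} = \delta(\mathbf{k} - \mathbf{k}'). \tag{15}$$

Thus (13) goes over to

$$[A_{\mu k}, \bar{A}_{\nu k'}] = -ig_{\mu\nu}/4\pi^2k_0\,.\,\delta(\mathbf{k} - \mathbf{k}'). \tag{16}$$

This equation, together with the equations

$$[A_{\mu k}, A_{\nu k'}] = [\bar{A}_{\mu k}, \bar{A}_{\nu k'}] = 0, \tag{17}$$

provide the P.B. relations in the theory with continuous $\mathbf{k}$-values.
It should be noted that these P.B. relations remain valid if we replace
$A_{\mu k}$, $\bar{A}_{\nu k'}$ by $A^c_{\mu k}$, $\bar{A}^c_{\nu k'}$: the same P.B. relations apply to the constant
Fourier coefficients.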

We must now obtain a Hamiltonian which makes each dynamical
variable $A_{\mu k}$ vary with the time $t = x_0$ in the Heisenberg picture
according to the law (9) with $A^c_{\mu k}$ constant. Calling this Hamiltonian
$H_F$, we require

$$[A_{\mu k}, H_F] = dA_{\mu k}/dx_0 = ik_0A_{\mu k}. \tag{18}$$

It is easily seen that this is satisfied by

$$H_F = -4\pi^2\int k_0^2A_{\mu k}\bar{A}^\mu_k\,d^3k. \tag{19}$$

We therefore take (19), with the possible addition of an arbitrary
numerical term not involving any dynamical variables, as the
Hamiltonian for the electromagnetic field in the absence of matter.

In § 63 we used our knowledge of the transverse part of the
Hamiltonian to obtain the P.B.s of the transverse variables. We have now
applied  the  reverse  procedure  to  the  longitudinal  variables,  using  our 
knowledge  of  their  P.B.s,  obtained  by  a relativistic  argument,  to 
find  the  part  of  the  Hamiltonian  that  refers  to  them  so  as  to  get 
agreement  with  (18). 

If we write out the Hamiltonian (19) it appears as

$$H_F = 4\pi^2\int k_0^2(A_{1k}\bar{A}_{1k} + A_{2k}\bar{A}_{2k} + A_{3k}\bar{A}_{3k} - A_{0k}\bar{A}_{0k})\,d^3k.$$

The first three terms of the integrand here have a transverse part
which is just equal to the transverse energy given by (71) of § 63.
The last term of the integrand, which is the part of $H_F$ referring to
the scalar potential $A_0$, appears with a minus sign. This minus sign
is demanded by relativity and means that the dynamical system
formed by the variables $A_{0k}$, $\bar{A}_{0k}$ is a harmonic oscillator of negative
energy. It is rather surprising that such an unphysical idea as negative
energy should appear in the theory in this way. We shall see in § 77
that the negative energy associated with the degrees of freedom
connected with $A_0$ is always compensated by the positive energy
associated with the other longitudinal degrees of freedom, so that
it never shows up in practice.

75.  Relativistic  form  of  the  quantum  conditions 

The  theory  of  the  preceding  section  has  relativistic  field  equations, 
namely  equations  (1).  To  establish  that  the  theory  is  fully  relativistic 
we  must  show  further  that  the  P.B.  relations  are  relativistic.  This  is 
not  at  all  evident  from  the  form  (16)  in  which  they  are  written  in 
terms  of  Fourier  components.  We  shall  obtain  a relativistic  form  for 
the P.B.s by working out $[A_\mu(x), A_\nu(x')]$ with $x$ and $x'$ any two points
in  space-time.  We  must  first,  however,  study  a certain  invariant 
singular  function  that  exists  in  space-time. 

The function $\delta(x_\mu x^\mu)$ is evidently Lorentz invariant. It vanishes
everywhere except on the light-cone with the origin as vertex, i.e. the
three-dimensional space $x_\mu x^\mu = 0$. This light-cone consists of two
distinct parts, a future part, for which $x_0 > 0$, and a past part, for which
$x_0 < 0$. The function which equals $\delta(x_\mu x^\mu)$ on the future part of the
light-cone and $-\delta(x_\mu x^\mu)$ on the past part of the light-cone is also
Lorentz invariant. This function, equal to $\delta(x_\mu x^\mu)x_0/|x_0|$, plays
an important role in the dynamical theory of fields, so we introduce
a special notation for it. We define

$$\Delta(x) = 2\delta(x_\mu x^\mu)x_0/|x_0|. \tag{20}$$

This definition gives a meaning to the function $\Delta$ applied to any
4-vector. With the help of (9) of § 15, we can express $\delta(x_\mu x^\mu)$ in the
form

$$\delta(x_\mu x^\mu) = \tfrac12|\mathbf{x}|^{-1}\{\delta(x_0 - |\mathbf{x}|) + \delta(x_0 + |\mathbf{x}|)\}, \tag{21}$$

$|\mathbf{x}|$ being the length of the three-dimensional part of $x_\mu$, and then
$\Delta(x)$ takes the form

$$\Delta(x) = |\mathbf{x}|^{-1}\{\delta(x_0 - |\mathbf{x}|) - \delta(x_0 + |\mathbf{x}|)\}. \tag{22}$$

$\Delta(x)$ is defined to have the value zero at the origin, and evidently
$\Delta(-x) = -\Delta(x)$.

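Let us make a Fourier analysis of $\Delta(x)$. Using $d^4x$ to denote
$dx_0\,dx_1\,dx_2\,dx_3$ and $d^3x$ to denote $dx_1\,dx_2\,dx_3$ we have, for any
4-vector $k_\mu$,

$$\int \Delta(x)e^{ik\cdot x}\,d^4x = \int |\mathbf{x}|^{-1}\{\delta(x_0 - |\mathbf{x}|) - \delta(x_0 + |\mathbf{x}|)\}e^{i[k_0x_0 - (\mathbf{kx})]}\,d^4x$$

$$= \int |\mathbf{x}|^{-1}\{e^{ik_0|\mathbf{x}|} - e^{-ik_0|\mathbf{x}|}\}e^{-i(\mathbf{kx})}\,d^3x.$$

By introducing polar coordinates $|\mathbf{x}|$, $\theta$, $\phi$ in the three-dimensional
$x_1 x_2 x_3$ space, with the direction of the three-dimensional part of
$k_\mu$ as pole, we get

$$\int \Delta(x)e^{ik\cdot x}\,d^4x = \iiint \{e^{ik_0|\mathbf{x}|} - e^{-ik_0|\mathbf{x}|}\}e^{-i|\mathbf{k}||\mathbf{x}|\cos\theta}\,|\mathbf{x}|\sin\theta\,d\theta\,d\phi\,d|\mathbf{x}|$$

$$= 2\pi\int_0^\infty \{e^{ik_0|\mathbf{x}|} - e^{-ik_0|\mathbf{x}|}\}\,d|\mathbf{x}| \int_0^\pi e^{-i|\mathbf{k}||\mathbf{x}|\cos\theta}\,|\mathbf{x}|\sin\theta\,d\theta$$

$$= 2\pi i|\mathbf{k}|^{-1}\int_0^\infty \{e^{ik_0|\mathbf{x}|} - e^{-ik_0|\mathbf{x}|}\}\{e^{-i|\mathbf{k}||\mathbf{x}|} - e^{i|\mathbf{k}||\mathbf{x}|}\}\,d|\mathbf{x}|$$

$$= 2\pi i|\mathbf{k}|^{-1}\int_{-\infty}^{\infty} \{e^{i(k_0 - |\mathbf{k}|)a} - e^{i(k_0 + |\mathbf{k}|)a}\}\,da$$

$$= 4\pi^2 i|\mathbf{k}|^{-1}\{\delta(k_0 - |\mathbf{k}|) - \delta(k_0 + |\mathbf{k}|)\}$$

$$= 4\pi^2 i\,\Delta(k). \tag{23}$$

Thus the Fourier analysis gives the same function again, with the
coefficient $4\pi^2 i$. Interchanging $k$ and $x$ in (23), we get

$$\Delta(x) = -\frac{i}{4\pi^2}\int \Delta(k)e^{ik\cdot x}\,d^4k. \tag{24}$$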
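The only non-trivial elementary step in the reduction leading to (23) is the integration over the polar angle. As a hedged illustration (a sketch with hypothetical function names, not taken from the text), the angular integral can be compared numerically with the closed form used there, for sample values of |k| and |x| chosen arbitrarily:

```python
import cmath
import math

def angular_integral(k, r, n=20_000):
    """Midpoint-rule value of  int_0^pi exp(-i k r cos t) r sin t dt."""
    h = math.pi / n
    total = 0j
    for j in range(n):
        t = (j + 0.5) * h
        total += cmath.exp(-1j * k * r * math.cos(t)) * r * math.sin(t)
    return total * h

def closed_form(k, r):
    """i |k|^-1 {exp(-i|k||x|) - exp(i|k||x|)}, which equals 2 sin(kr)/k."""
    return 1j / k * (cmath.exp(-1j * k * r) - cmath.exp(1j * k * r))

if __name__ == "__main__":
    k, r = 2.0, 1.5   # arbitrary sample values of |k| and |x|
    print(angular_integral(k, r), closed_form(k, r), 2.0 * math.sin(k * r) / k)
```

The result is real, which is why only the difference of the two exponentials in |k||x| survives in the next line of the derivation.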

Some of the important properties of $\Delta(x)$ can easily be deduced
from its Fourier resolution. In the first place equation (24) shows that
$\Delta(x)$ can be resolved into waves all travelling with the velocity of
light. To get an equation for this result we apply the operator $\Box$ to
both sides of (24), thus

$$\Box\Delta(x) = -\frac{i}{4\pi^2}\int \Delta(k)\,\Box e^{ik\cdot x}\,d^4k = \frac{i}{4\pi^2}\int k_\mu k^\mu\,\Delta(k)\,e^{ik\cdot x}\,d^4k.$$

Now $k_\mu k^\mu\,\Delta(k) = 0$, and hence

$$\Box\Delta(x) = 0. \tag{25}$$

This  equation  holds  throughout  space-time.  We  can  give  a meaning 
to $\Box\Delta(x)$ at a point where $\Delta(x)$ is singular by taking the integral
of $\Box\Delta(x)$ over a small four-dimensional space surrounding the point
and transforming it to a three-dimensional surface integral by Gauss's
theorem. Equation (25) informs us that the three-dimensional surface
integral always vanishes.

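The function $\Delta(x)$ vanishes all over the three-dimensional surface
$x_0 = 0$. Let us determine the value of $\partial\Delta(x)/\partial x_0$ on this surface. It
evidently vanishes everywhere except at the point $x_1 = x_2 = x_3 = 0$,
where it has a singularity which can be evaluated as follows. Differentiating
both sides of (24) with respect to $x_0$, we get

$$\frac{\partial\Delta(x)}{\partial x_0} = \frac{1}{4\pi^2}\int k_0\,\Delta(k)\,e^{ik\cdot x}\,d^4k$$

$$= \frac{1}{4\pi^2}\int k_0|\mathbf{k}|^{-1}\{\delta(k_0 - |\mathbf{k}|) - \delta(k_0 + |\mathbf{k}|)\}e^{ik\cdot x}\,d^4k$$

$$= \frac{1}{4\pi^2}\int \{\delta(k_0 - |\mathbf{k}|) + \delta(k_0 + |\mathbf{k}|)\}e^{ik\cdot x}\,d^4k.$$

Putting $x_0 = 0$ on both sides here, we get

$$\left[\frac{\partial\Delta(x)}{\partial x_0}\right]_{x_0=0} = \frac{1}{4\pi^2}\int \{\delta(k_0 - |\mathbf{k}|) + \delta(k_0 + |\mathbf{k}|)\}e^{-i(\mathbf{kx})}\,d^4k$$

$$= \frac{1}{2\pi^2}\int e^{-i(\mathbf{kx})}\,d^3k$$

$$= 4\pi\,\delta(x_1)\delta(x_2)\delta(x_3) = 4\pi\,\delta(\mathbf{x}). \tag{26}$$

Thus the ordinary $\delta$ singularity, with the coefficient $4\pi$, appears at
the point $x_1 = x_2 = x_3 = 0$.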

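Let us now evaluate $[A_\mu(x), A_\nu(x')]$. We have from (3), (16), and (17)

$$[A_\mu(x), A_\nu(x')] = \iint [A_{\mu\mathbf{k}}e^{ik\cdot x} + \bar A_{\mu\mathbf{k}}e^{-ik\cdot x},\; A_{\nu\mathbf{k}'}e^{ik'\cdot x'} + \bar A_{\nu\mathbf{k}'}e^{-ik'\cdot x'}]\,d^3k\,d^3k'$$

$$= \frac{i g_{\mu\nu}}{4\pi^2}\iint k_0^{-1}\{e^{-ik\cdot x}e^{ik'\cdot x'} - e^{ik\cdot x}e^{-ik'\cdot x'}\}\,\delta(\mathbf{k} - \mathbf{k}')\,d^3k\,d^3k'$$

$$= \frac{i g_{\mu\nu}}{4\pi^2}\int k_0^{-1}\{e^{-ik\cdot(x - x')} - e^{ik\cdot(x - x')}\}\,d^3k. \tag{27}$$

The $k_0$ here is defined to be equal to $|\mathbf{k}|$ and is thus always positive.
By putting $-\mathbf{k}$ for $\mathbf{k}$ in the second part of the integrand, one finds
that (27) is equal to the four-dimensional integral

$$\frac{i g_{\mu\nu}}{4\pi^2}\int |\mathbf{k}|^{-1}\{\delta(k_0 - |\mathbf{k}|) - \delta(k_0 + |\mathbf{k}|)\}e^{-ik\cdot(x - x')}\,d^4k = \frac{i g_{\mu\nu}}{4\pi^2}\int \Delta(k)\,e^{-ik\cdot(x - x')}\,d^4k,$$

in which $k_0$ takes on all values, negative as well as positive. Evaluating
this with the help of (24), we get finally

$$[A_\mu(x), A_\nu(x')] = g_{\mu\nu}\,\Delta(x - x'), \tag{28}$$

a result which shows that the P.B. relations are invariant under
Lorentz transformations.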

The formula (28) means that the potentials at two points in space-time
always commute unless the line joining the two points is a null
line, i.e. the track of a light-ray. The formula is consistent with the
field equations $\Box A_\mu(x) = 0$, because $\Box$ applied to the right-hand
side gives zero, from (25).


76.  The  dynamical  variables  at  one  time 

As a basis for a theory with interaction we must use the dynamical
variables at one time. The relationships between the dynamical
variables at one time (i.e. their P.B.s) are not affected by the introduction
of interaction. On the other hand the relationships between the
dynamical variables at different times (comprising the field equations
as well as the P.B.s of variables at different times) are very much
affected by the interaction. The dynamical variables at one time form
a non-relativistic concept, but a very important concept in Hamiltonian
theory.

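For the case of the electromagnetic field the independent dynamical
variables at one time are $A_\mu$ and $\partial A_\mu/\partial x_0$ for all values of $x_1$, $x_2$, $x_3$ for
the given $x_0$. The higher time derivatives $\partial^2 A_\mu/\partial x_0^2$ are not
independent. Let us put

$$B_\mu = \partial A_\mu/\partial x_0. \tag{29}$$

Then we have $A_{\mu\mathbf{x}}$, $B_{\mu\mathbf{x}}$, with the suffix $\mathbf{x}$ denoting $x_1$, $x_2$, $x_3$, as the
dynamical variables at one time.

The Fourier resolution of these variables is, from (3) and (9),

$$A_{\mu\mathbf{x}} = \int (A_{\mu\mathbf{k}} + \bar A_{\mu\,-\mathbf{k}})\,e^{-i(\mathbf{kx})}\,d^3k,$$

$$B_{\mu\mathbf{x}} = i\int k_0(A_{\mu\mathbf{k}} - \bar A_{\mu\,-\mathbf{k}})\,e^{-i(\mathbf{kx})}\,d^3k. \tag{30}$$

We may reverse the Fourier transformation and express $A_{\mu\mathbf{k}} + \bar A_{\mu\,-\mathbf{k}}$
and $A_{\mu\mathbf{k}} - \bar A_{\mu\,-\mathbf{k}}$ in terms of $A_{\mu\mathbf{x}}$ and $B_{\mu\mathbf{x}}$ respectively. Thus $A_{\mu\mathbf{k}}$ and
$\bar A_{\mu\mathbf{k}}$ are determined by $A_{\mu\mathbf{x}}$, $B_{\mu\mathbf{x}}$ for all $\mathbf{x}$ (at a given $x_0$). The equations
connecting $A_{\mu\mathbf{k}}$, $\bar A_{\mu\mathbf{k}}$ with $A_{\mu\mathbf{x}}$, $B_{\mu\mathbf{x}}$ do not involve the time
explicitly. Thus the $A_{\mu\mathbf{k}}$, $\bar A_{\mu\mathbf{k}}$ form an alternative set of one-time
dynamical variables, on the same footing as the $A_{\mu\mathbf{x}}$, $B_{\mu\mathbf{x}}$.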

When we work with the variables $A_{\mu\mathbf{x}}$, $B_{\mu\mathbf{x}}$ we shall need to know
their P.B. relations. These may be obtained either from the Fourier
expansions (30) together with (16) and (17) or from the general P.B.
relation (28). The latter gives the required results more quickly.
Putting $x'_0 = x_0$ in (28), we get

$$[A_{\mu\mathbf{x}}, A_{\nu\mathbf{x}'}] = 0. \tag{31}$$

Differentiating (28) with respect to $x_0$ and then putting $x'_0 = x_0$, we
get, with the help of (26),

$$[B_{\mu\mathbf{x}}, A_{\nu\mathbf{x}'}] = 4\pi g_{\mu\nu}\,\delta(\mathbf{x} - \mathbf{x}'). \tag{32}$$

Differentiating (28) with respect to both $x_0$ and $x'_0$ and then putting
$x'_0 = x_0$, we get

$$[B_{\mu\mathbf{x}}, B_{\nu\mathbf{x}'}] = 0, \tag{33}$$

since $\partial^2\Delta(x)/\partial x_0^2 = 0$ for $x_0 = 0$. Equations (31), (32), and (33) give
all the P.B. relations between the $A_{\mu\mathbf{x}}$, $B_{\mu\mathbf{x}}$ variables. They show that,
apart from numerical coefficients, the $A_{\mu\mathbf{x}}$ can be looked upon as a set
of dynamical coordinates and the $B_{\mu\mathbf{x}}$ as their conjugate momenta,
there being a $\delta$ function on the right-hand side of (32) instead of a
two-suffix $\delta$ symbol on account of the number of degrees of freedom
being a continuous infinity.
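The three equal-time relations follow from (28) in a few lines. The block below spells out the intermediate steps as a sketch; the abbreviation $X = x - x'$ is introduced here only for convenience, and the oddness argument for the last step is one possible way of seeing that the second time derivative vanishes, not necessarily the one intended in the text.

```latex
% Abbreviation (assumed here): X = x - x', so that (28) reads
%   [A_\mu(x), A_\nu(x')] = g_{\mu\nu}\,\Delta(X).
\begin{align*}
[A_{\mu\mathbf{x}}, A_{\nu\mathbf{x}'}]
  &= g_{\mu\nu}\,\Delta(X)\big|_{X_0=0} = 0
  && \text{since } \Delta \text{ vanishes on } X_0 = 0, \\
[B_{\mu\mathbf{x}}, A_{\nu\mathbf{x}'}]
  &= g_{\mu\nu}\,\frac{\partial\Delta(X)}{\partial X_0}\Big|_{X_0=0}
   = 4\pi g_{\mu\nu}\,\delta(\mathbf{x}-\mathbf{x}')
  && \text{from (26)}, \\
[B_{\mu\mathbf{x}}, B_{\nu\mathbf{x}'}]
  &= g_{\mu\nu}\,\frac{\partial^2\Delta(X)}{\partial x_0\,\partial x'_0}\Big|_{X_0=0}
   = -g_{\mu\nu}\,\frac{\partial^2\Delta(X)}{\partial X_0^2}\Big|_{X_0=0} = 0 .
\end{align*}
% Last step: by (22), \Delta changes sign when X_0 changes sign, so every
% even-order derivative with respect to X_0 is odd in X_0 and vanishes at X_0 = 0.
```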

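We can decompose $A_{r\mathbf{x}}$ into a transverse and a longitudinal part,
as shown by equations (8) and (6). We can do the same with $B_{r\mathbf{x}}$ and
get

$$B_r = \mathscr{B}_r + \partial U/\partial x_r \tag{34}$$

with

$$\partial\mathscr{B}_r/\partial x_r = 0. \tag{35}$$

From (7) with $-\mathbf{k}$ substituted for $\mathbf{k}$ in the second term of the integrand,

$$V = i\int k_s k_0^{-2}(A_{s\mathbf{k}} + \bar A_{s\,-\mathbf{k}})\,e^{-i(\mathbf{kx})}\,d^3k. \tag{36}$$

The corresponding equation for $U$ is, since $U = \partial V/\partial x_0$,

$$U = -\int k_s k_0^{-1}(A_{s\mathbf{k}} - \bar A_{s\,-\mathbf{k}})\,e^{-i(\mathbf{kx})}\,d^3k. \tag{37}$$

The electric field is given by

$$\mathscr{E}_r = B_r + \frac{\partial A_0}{\partial x_r} = \mathscr{B}_r + \frac{\partial(A_0 + U)}{\partial x_r}. \tag{38}$$

Thus

$$\mathrm{div}\,\mathscr{E} = -\frac{\partial\mathscr{E}_r}{\partial x_r} = -\nabla^2 A_0 - \frac{\partial B_r}{\partial x_r} = -\nabla^2(A_0 + U). \tag{39}$$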

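It is evident that any longitudinal variable commutes with any
transverse variable. Some useful P.B. relations will now be worked
out. We shall use the notation, for any field function $f_{\mathbf{x}}$,

$$\frac{\partial f_{\mathbf{x}}}{\partial x_r} = f_{\mathbf{x}}^{\,r}, \qquad \frac{\partial f_{\mathbf{x}'}}{\partial x'_r} = f_{\mathbf{x}'}^{\,r'}. \tag{40}$$

If in (32) we put $\mu = r$, $\nu = s$ and differentiate the equation with
respect to $x_r$, we get

$$[B_{r\mathbf{x}}^{\,r}, A_{s\mathbf{x}'}] = 4\pi g_{rs}\,\delta^r(\mathbf{x} - \mathbf{x}') = -4\pi\,\delta^s(\mathbf{x} - \mathbf{x}'),$$

or, from (39),

$$[\mathrm{div}\,\mathscr{E}_{\mathbf{x}}, A_{s\mathbf{x}'}] = 4\pi\,\delta^s(\mathbf{x} - \mathbf{x}'). \tag{41}$$

Now (39) shows that $\mathrm{div}\,\mathscr{E}$ is a function only of the longitudinal
variables, so (41) gives

$$[\mathrm{div}\,\mathscr{E}_{\mathbf{x}}, V_{\mathbf{x}'}^{\,s'}] = 4\pi\,\delta^s(\mathbf{x} - \mathbf{x}') = -4\pi\,\delta^{s'}(\mathbf{x} - \mathbf{x}').$$

Integrating with respect to $x'_s$, we get

$$[\mathrm{div}\,\mathscr{E}_{\mathbf{x}}, V_{\mathbf{x}'}] = -4\pi\,\delta(\mathbf{x} - \mathbf{x}'), \tag{42}$$

there being no constant of integration since the field functions $\mathscr{E}_{\mathbf{x}}$ and
$V_{\mathbf{x}}$ are made up of waves of non-zero wave length. From (42) and (39)

$$\nabla^2[U_{\mathbf{x}}, V_{\mathbf{x}'}] = 4\pi\,\delta(\mathbf{x} - \mathbf{x}').$$

Integrating with the help of formula (72) of § 38, we get

$$[U_{\mathbf{x}}, V_{\mathbf{x}'}] = -|\mathbf{x} - \mathbf{x}'|^{-1}, \tag{43}$$

there being no constant of integration or other terms not vanishing
at infinity on the right-hand side, because $U_{\mathbf{x}}$ and $V_{\mathbf{x}}$ are made up of
waves of non-zero wave length. We have from (38) and (43)

$$[\mathscr{E}_{r\mathbf{x}}, V_{\mathbf{x}'}] = [U_{\mathbf{x}}^{\,r}, V_{\mathbf{x}'}] = -(x^r - x'^r)\,|\mathbf{x} - \mathbf{x}'|^{-3}. \tag{44}$$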

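We shall now obtain the Hamiltonian in terms of the $A_{\mu\mathbf{x}}$ and $B_{\mu\mathbf{x}}$
variables. We have from the second of equations (30)

$$\int B_\mu B^\mu\,d^3x = -\iiint k_0 k'_0 (A_{\mu\mathbf{k}} - \bar A_{\mu\,-\mathbf{k}})(A^\mu_{\mathbf{k}'} - \bar A^\mu_{-\mathbf{k}'})\,e^{-i(\mathbf{kx})}e^{-i(\mathbf{k}'\mathbf{x})}\,d^3k\,d^3k'\,d^3x$$

$$= -8\pi^3\iint k_0 k'_0 (A_{\mu\mathbf{k}} - \bar A_{\mu\,-\mathbf{k}})(A^\mu_{\mathbf{k}'} - \bar A^\mu_{-\mathbf{k}'})\,\delta(\mathbf{k} + \mathbf{k}')\,d^3k\,d^3k'$$

$$= 8\pi^3\int k_0^2 (A_{\mu\mathbf{k}} - \bar A_{\mu\,-\mathbf{k}})(\bar A^\mu_{\mathbf{k}} - A^\mu_{-\mathbf{k}})\,d^3k.$$

Similarly, from the first of equations (30),

$$\int A_\mu^{\,r} A^{\mu r}\,d^3x = -\iiint k_r k'_r (A_{\mu\mathbf{k}} + \bar A_{\mu\,-\mathbf{k}})(A^\mu_{\mathbf{k}'} + \bar A^\mu_{-\mathbf{k}'})\,e^{-i(\mathbf{kx})}e^{-i(\mathbf{k}'\mathbf{x})}\,d^3k\,d^3k'\,d^3x$$

$$= 8\pi^3\int k_0^2 (A_{\mu\mathbf{k}} + \bar A_{\mu\,-\mathbf{k}})(A^\mu_{-\mathbf{k}} + \bar A^\mu_{\mathbf{k}})\,d^3k.$$

Adding and dividing by $-8\pi$, we get

$$-(8\pi)^{-1}\int (B_\mu B^\mu + A_\mu^{\,r} A^{\mu r})\,d^3x = -2\pi^2\int k_0^2 (A_{\mu\mathbf{k}}\bar A^\mu_{\mathbf{k}} + \bar A_{\mu\mathbf{k}} A^\mu_{\mathbf{k}})\,d^3k.$$

This is equal to $H_F$ given by (19), apart from an infinite numerical
term. The formula (19) for $H_F$ already involves an arbitrary numerical
term, so we may take

$$H_F = -(8\pi)^{-1}\int (B_\mu B^\mu + A_\mu^{\,r} A^{\mu r})\,d^3x \tag{45}$$

with an arbitrary numerical term, different from that of (19).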

The Hamiltonian (45) can, of course, be used to give the Heisenberg
equations of motion, and the arbitrary numerical term in it does not
affect them. One can easily check, using (31), (32) and (33), that

$$\partial A_{\mu\mathbf{x}}/\partial x_0 = [A_{\mu\mathbf{x}}, H_F] = B_{\mu\mathbf{x}}$$

and

$$\partial B_{\mu\mathbf{x}}/\partial x_0 = [B_{\mu\mathbf{x}}, H_F] = \nabla^2 A_{\mu\mathbf{x}}, \tag{46}$$

agreeing with (29) and (1). It also gives the Schrödinger equation of
motion

$$i\hbar\,d|P\rangle/dx_0 = H_F|P\rangle,$$

with $|P\rangle$ representing a state in the Schrödinger picture. The
arbitrary numerical term here has the effect of changing $|P\rangle$ by a
phase factor, which is not of physical importance.


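We can decompose the expression (45) for $H_F$ into a transverse part
$H_{FT}$ and a longitudinal part $H_{FL}$. We have from (34)

$$\int B_r B_r\,d^3x = \int (\mathscr{B}_r + U^r)(\mathscr{B}_r + U^r)\,d^3x = \int \mathscr{B}_r\mathscr{B}_r\,d^3x + \int U^r U^r\,d^3x,$$

since the cross terms vanish on account of

$$\int \mathscr{B}_r U^r\,d^3x = -\int \mathscr{B}_r^{\,r}\,U\,d^3x = 0$$

from (35). Similarly we have from (8)

$$\int A_r^{\,s} A_r^{\,s}\,d^3x = \int \mathscr{A}_r^{\,s}\mathscr{A}_r^{\,s}\,d^3x + \int V^{rs}V^{rs}\,d^3x,$$

with the cross terms vanishing again. Thus (45) becomes

$$H_F = H_{FT} + H_{FL}$$

with

$$H_{FT} = (8\pi)^{-1}\int (\mathscr{B}_r\mathscr{B}_r + \mathscr{A}_r^{\,s}\mathscr{A}_r^{\,s})\,d^3x \tag{47}$$

and

$$H_{FL} = (8\pi)^{-1}\int (U^r U^r + V^{rs}V^{rs} - B_0 B_0 - A_0^{\,r}A_0^{\,r})\,d^3x. \tag{48}$$

It should be noted that the term

$$(8\pi)^{-1}\int \mathscr{A}_r^{\,s}\mathscr{A}_r^{\,s}\,d^3x$$

in $H_{FT}$ can be transformed to

$$(8\pi)^{-1}\int \mathscr{A}_r^{\,s}(\mathscr{A}_r^{\,s} - \mathscr{A}_s^{\,r})\,d^3x = (16\pi)^{-1}\int (\mathscr{A}_r^{\,s} - \mathscr{A}_s^{\,r})(\mathscr{A}_r^{\,s} - \mathscr{A}_s^{\,r})\,d^3x = (8\pi)^{-1}\int \mathscr{H}^2\,d^3x,$$

so this term is just the magnetic energy. Some further partial
integrations give

$$\int V^{rs}V^{rs}\,d^3x = \int V^{rr}V^{ss}\,d^3x,$$

so (48) may be written

$$H_{FL} = (8\pi)^{-1}\int \{(U - A_0)^r(U + A_0)^r + (V^{rr} - B_0)(V^{ss} + B_0)\}\,d^3x. \tag{49}$$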

77.  The  supplementary  conditions 

We must now go back to the Maxwell equation (2), which we have
ignored so far. We cannot take this equation over directly into the
quantum theory without getting inconsistencies. The left-hand side
of the equation does not commute with $A_\nu(x')$, according to the
quantum conditions (28), so this left-hand side cannot vanish. The
way out of the difficulty was shown by Fermi.† It consists in adopting
a less stringent equation, namely the equation

$$(\partial A_\mu/\partial x_\mu)|P\rangle = 0 \tag{50}$$

and assuming it to hold for any $|P\rangle$ corresponding to a state that can
actually occur in nature. There is one equation (50) for each point
in space-time and these equations must all hold for any ket corresponding
to a state that can actually occur.

We shall call a condition such as (50), which a ket has to satisfy to
correspond to an actual state, a supplementary condition. The existence
of supplementary conditions in the theory does not mean any
departure from or modification in the general principles of quantum
mechanics. The principle of superposition of states and the whole of
the general theory of states, dynamical variables, and observables,
as given in Chapter II, apply also when there are supplementary
conditions, provided we impose a further requirement on a linear
operator in order that it may represent an observable. We define a
linear operator to be physical if it has the property that, when it
operates on any ket satisfying the supplementary conditions, it produces
another ket satisfying the supplementary conditions. In order
that a linear operator may represent an observable it must evidently
satisfy the requirement of being physical, in addition to the requirements
of § 10.

† Fermi, Reviews of Modern Physics, 4 (1932), 125.

We have already had an example of supplementary conditions in
the theory of systems containing several similar particles. The condition
that only symmetrical wave functions, or only antisymmetrical
wave functions, represent states that can actually occur in nature is
precisely of the same type as condition (50) and is what we are now
calling a supplementary condition. In this theory the requirement
that a linear operator shall be physical is that it shall be symmetrical
between the similar particles.
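A small finite-dimensional model may make the notion of a physical operator concrete. The sketch below is purely illustrative and rests on assumptions not in the text: two similar particles, each with only two states, a hypothetical one-particle matrix `a`, and the symmetric-state condition as the supplementary condition. An operator symmetrical between the particles keeps a symmetric ket symmetric; an operator acting on one particle alone does not.

```python
def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    return [[a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
            for i in range(len(a)) for k in range(len(b))]

def add(a, b):
    """Element-wise sum of two matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def apply(m, v):
    """Matrix acting on a column vector."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

IDENTITY_2 = [[1, 0], [0, 1]]
a = [[0, 1], [1, 0]]   # hypothetical one-particle operator

# Exchange of the two particles, in the basis |00>, |01>, |10>, |11>.
SWAP = [[1, 0, 0, 0], [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 1]]

def is_symmetric_state(v):
    """True if the ket is unchanged by interchanging the particles."""
    return apply(SWAP, v) == v

symmetric_op = add(kron(a, IDENTITY_2), kron(IDENTITY_2, a))  # a(1) + a(2)
one_sided_op = kron(a, IDENTITY_2)                            # a(1) only

state = [1, 0, 0, 0]   # |00>, a symmetric ket

if __name__ == "__main__":
    print(is_symmetric_state(state))
    print(is_symmetric_state(apply(symmetric_op, state)))
    print(is_symmetric_state(apply(one_sided_op, state)))
```

In this toy model `symmetric_op` is physical in the sense defined above, while `one_sided_op` takes the ket out of the set of allowed states.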


When we introduce supplementary conditions into our theory we
must verify that they are consistent, i.e. not too restrictive to allow
any ket at all to satisfy them. If we have more than one supplementary
condition, we can deduce further supplementary conditions from them
by taking P.B.s of the operators in them; thus if we have

$$U|P\rangle = 0, \qquad V|P\rangle = 0, \tag{51}$$

we can deduce

$$[U, V]|P\rangle = 0, \qquad [U, [U, V]]|P\rangle = 0, \tag{52}$$

and so on. To verify that our supplementary conditions are consistent
we have to look into all the further supplementary conditions obtainable
by this procedure to see that they can be satisfied, which we can
usually do by showing that after a certain point the further supplementary
conditions are all either identically satisfied or repetitions
of the previous ones.

We must also verify that the supplementary conditions are in agreement
with the equations of motion. In the Heisenberg picture, for
which the ket $|P\rangle$ in (51) is fixed, we shall have different supplementary
conditions referring to different times and they must all be consistent
in the way discussed above. In the Schrödinger picture, for which the
ket $|P\rangle$ varies with the time in accordance with Schrödinger's equation,
we require that if $|P\rangle$ satisfies the supplementary conditions initially
it satisfies them always. This means that $d|P\rangle/dt$ must satisfy the
supplementary conditions, or that $H|P\rangle$ must satisfy the supplementary
conditions, or that $H$ must be physical.

It is convenient when we have a supplementary condition $U|P\rangle = 0$
to write

$$U \approx 0 \tag{53}$$

and to call (53) a weak equation, in distinction to an ordinary or strong
equation. A weak equation gives another weak equation if it is
multiplied by any factor on the left, but does not in general give a valid
equation if it is multiplied by a factor on the right. Thus a weak
equation must not be used in working out P.B.s. With this way of
speaking, the requirement (52) that the supplementary conditions are
consistent becomes the requirement that the P.B.s of the operators
in the supplementary conditions shall vanish, weakly.
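The asymmetry between left and right multiplication can be seen in the smallest possible example. The following sketch uses made-up 2×2 matrices `U` and `F` and a ket `P` (none of which come from the text) with U|P> = 0, and a plain commutator as a stand-in for the P.B.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def apply(m, v):
    """Matrix acting on a column vector."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

U = [[0, 1], [0, 0]]   # hypothetical operator with U|P> = 0, i.e. U "weakly zero"
F = [[0, 1], [1, 0]]   # an arbitrary factor
P = [1, 0]             # the ket |P>

left = apply(matmul(F, U), P)    # factor on the left:  F U |P>
right = apply(matmul(U, F), P)   # factor on the right: U F |P>
commutator = [r - l for r, l in zip(right, left)]   # (U F - F U)|P>

if __name__ == "__main__":
    print(apply(U, P), left, right, commutator)
```

Here F U still annihilates |P>, but U F does not, so setting U equal to zero before forming the commutator would give a wrong result.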

The condition for a dynamical variable $\xi$ to be physical is that, for
each supplementary condition $U|P\rangle = 0$, we have

$$U\xi|P\rangle = 0,$$

and hence

$$[U, \xi]|P\rangle = 0.$$

Thus the condition is that the P.B. of the dynamical variable with
each of the operators of the supplementary conditions shall vanish
weakly.

Let us now return to electrodynamics. We take equation (2) to be
a weak equation, so it should be written

$$\partial A_\mu/\partial x_\mu \approx 0. \tag{54}$$

In the Heisenberg picture we have one of these equations for each
point $x$. To check their consistency, we take two arbitrary points $x$
and $x'$ in space-time and form the P.B.

$$\left[\frac{\partial A_\mu(x)}{\partial x_\mu}, \frac{\partial A_\nu(x')}{\partial x'_\nu}\right] = \frac{\partial^2}{\partial x_\mu\,\partial x'_\nu}[A_\mu(x), A_\nu(x')].$$

Evaluating it with the help of (28), we get

$$g_{\mu\nu}\frac{\partial^2\Delta(x - x')}{\partial x_\mu\,\partial x'_\nu} = -\Box\Delta(x - x') = 0$$

from (25), so the requirements for consistency are satisfied strongly.
As we have verified that the supplementary conditions are consistent
at all times in the Heisenberg picture, we have verified that they are
in agreement with the equations of motion.

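Since equation (54) is only a weak equation, any of its consequences
in the ordinary Maxwell theory will be valid in the quantum theory
only as weak equations. The equations

$$\mathrm{div}\,\mathscr{H} = 0, \qquad \partial\mathscr{H}/\partial t = -\mathrm{curl}\,\mathscr{E}$$

follow simply from the definitions of $\mathscr{E}$ and $\mathscr{H}$ in terms of the potentials,
so they are valid strongly in the quantum theory. The other Maxwell
equations for empty space, namely

$$\mathrm{div}\,\mathscr{E} \approx 0, \qquad \partial\mathscr{E}/\partial t \approx \mathrm{curl}\,\mathscr{H}, \tag{55}$$

are weak equations in the quantum theory, because one needs the
help of (54) as well as (1) in deriving them.

The field quantities $\mathscr{E}$ and $\mathscr{H}$ are components of the antisymmetric
tensor $\partial A_\nu/\partial x^\mu - \partial A_\mu/\partial x^\nu$. The P.B. of the tensor with the operator
of (54) at a general point $x'$ is

$$\left[\frac{\partial A_\nu(x)}{\partial x^\mu} - \frac{\partial A_\mu(x)}{\partial x^\nu}, \frac{\partial A_\sigma(x')}{\partial x'_\sigma}\right] = g_{\nu\sigma}\frac{\partial^2\Delta(x - x')}{\partial x^\mu\,\partial x'_\sigma} - g_{\mu\sigma}\frac{\partial^2\Delta(x - x')}{\partial x^\nu\,\partial x'_\sigma} = 0.$$

It follows that $\mathscr{E}$ and $\mathscr{H}$ are physical. The potentials $A_\mu$ are not
physical.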

The supplementary conditions affecting the dynamical variables at
a particular time are

$$\frac{\partial A_\mu}{\partial x_\mu} \approx 0, \qquad \frac{\partial}{\partial x_0}\frac{\partial A_\mu}{\partial x_\mu} \approx 0. \tag{56}$$

Higher differentiations with respect to $x_0$ do not give independent
equations, but equations which are consequences of these and the
strong equation (1). Thus in terms of the Schrödinger variables of
§ 76, the supplementary conditions are

$$B_0 + A_r^{\,r} \approx 0 \tag{57}$$

and

$$(A_0^{\,r} + B_r)^r \approx 0. \tag{58}$$

Equation (58) is the same as the first of equations (55) and may also
be written, from (39),

$$\nabla^2(A_0 + U) \approx 0.$$

Since this holds throughout three-dimensional space, it leads to

$$A_0 + U \approx 0. \tag{59}$$

Noting that $A_r^{\,r} = V^{rr}$, we can now see from (49) that

$$H_{FL} \approx 0. \tag{60}$$

Thus there is no longitudinal field energy for states that occur in nature.

To set up a convenient representation, we introduce a standard ket
$|0_F\rangle$ satisfying the supplementary conditions

$$(B_0 + A_r^{\,r})|0_F\rangle = 0, \qquad (A_0 + U)|0_F\rangle = 0, \tag{61}$$

and also satisfying

$$\bar{\mathscr{A}}_{r\mathbf{k}}|0_F\rangle = 0. \tag{62}$$

These conditions are consistent, because $\bar{\mathscr{A}}_{r\mathbf{k}}$ commutes with the
operators in (61), and they are sufficient to fix $|0_F\rangle$ completely, apart
from a numerical factor, because the only independent dynamical
variables that we have are $A_0$, $B_0$, $U$, $A_r^{\,r}$, $\mathscr{A}_{r\mathbf{k}}$, $\bar{\mathscr{A}}_{r\mathbf{k}}$, and of these
$A_0 + U$, $B_0 + A_r^{\,r}$, $\bar{\mathscr{A}}_{r\mathbf{k}}$ form a complete commuting set. With this
standard ket we can express any ket as

$$\Psi(A_0, B_0, \mathscr{A}_{r\mathbf{k}})|0_F\rangle. \tag{63}$$

Our representation is just the Fock representation so far as concerns
the transverse dynamical variables $\mathscr{A}_{r\mathbf{k}}$, $\bar{\mathscr{A}}_{r\mathbf{k}}$, so $\Psi$ must be a power
series in the variables $\mathscr{A}_{r\mathbf{k}}$, with different terms in the series corresponding
to the presence of different numbers of photons. The number
of variables occurring in $\Psi$ is a continuous infinity, so $\Psi$ is what
mathematicians call a 'functional'.

If the ket (63) satisfies the supplementary conditions, $\Psi$ must be
independent of $A_0$ and $B_0$, and thus a function only of the $\mathscr{A}_{r\mathbf{k}}$. So
physical states are represented by kets of the form

$$\Psi(\mathscr{A}_{r\mathbf{k}})|0_F\rangle, \tag{64}$$

with $\Psi$ a power series in the variables $\mathscr{A}_{r\mathbf{k}}$. The standard ket $|0_F\rangle$
itself represents the physical state with no photons present, the perfect
vacuum.

Our Hamiltonian $H_F$ and its parts $H_{FL}$, $H_{FT}$ have so far contained
arbitrary numerical terms. It is convenient to choose these terms so
that $H_{FL}$, $H_{FT}$ are zero for the perfect vacuum. The result (60) shows
that $H_{FL}$ given by (48) or (49) has the numerical term in it correctly
chosen to make $H_{FL}$ have the value zero for the perfect vacuum, as
well as for every other physical state. We must take $H_{FT}$ to be

$$H_{FT} = 4\pi^2\int k_0^2\,\mathscr{A}_{r\mathbf{k}}\bar{\mathscr{A}}_{r\mathbf{k}}\,d^3k, \tag{65}$$

the transverse part of (19), in order that the numerical term in it may
be correctly chosen to give no zero-point energy for the photons;
(47) differs from (65) by an infinite numerical term, consisting of a
half-quantum of energy for each photon state.

78. Electrons and positrons by themselves

We now consider electrons and positrons in the absence of electromagnetic
field. The state of an electron is described, as in Chapter XI,
by a wave function $\psi$ with four components $\psi_a$ ($a = 1, 2, 3, 4$), satisfying
the wave equation

$$i\hbar\,\frac{\partial\psi}{\partial x_0} = -i\hbar\,\alpha_r\frac{\partial\psi}{\partial x_r} + \alpha_m m\,\psi. \tag{66}$$

To get a many-electron theory we shall apply the method of second
quantization of § 65, which involves changing the one-electron wave
function into a set of operators satisfying certain anticommutation
relations.

When we are dealing with $\psi$ at various places at a given time we
may write it $\psi_{\mathbf{x}}$, with $\mathbf{x}$ denoting $x_1$, $x_2$, $x_3$. Its components are then
$\psi_{a\mathbf{x}}$. We pass to the momentum representation with the wave function
$\psi_{\mathbf{p}}$ by a three-dimensional Fourier resolution

$$\psi_{\mathbf{x}} = h^{-3/2}\int e^{i(\mathbf{xp})/\hbar}\,\psi_{\mathbf{p}}\,d^3p, \qquad \psi_{\mathbf{p}} = h^{-3/2}\int e^{-i(\mathbf{xp})/\hbar}\,\psi_{\mathbf{x}}\,d^3x. \tag{67}$$

$\psi_{\mathbf{p}}$ has four components $\psi_{a\mathbf{p}}$, corresponding to the four components
of $\psi_{\mathbf{x}}$. In this representation the energy operator is

$$p_0 = \alpha_r p_r + \alpha_m m,$$

in which the momentum operators $p_r$ are multiplying factors.

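We can separate $\psi$ into a positive-energy part $\xi$ and a negative-energy
part $\zeta$, $\xi$ and $\zeta$ each having four components like $\psi$. In the momentum
representation they are given by

$$\xi_{\mathbf{p}} = \frac{1}{2}\left\{1 + \frac{\alpha_r p_r + \alpha_m m}{(\mathbf{p}^2 + m^2)^{1/2}}\right\}\psi_{\mathbf{p}}, \qquad \zeta_{\mathbf{p}} = \frac{1}{2}\left\{1 - \frac{\alpha_r p_r + \alpha_m m}{(\mathbf{p}^2 + m^2)^{1/2}}\right\}\psi_{\mathbf{p}}, \tag{68}$$

since these equations lead to

$$p_0\xi_{\mathbf{p}} = (\alpha_r p_r + \alpha_m m)\xi_{\mathbf{p}} = \tfrac{1}{2}\{\alpha_r p_r + \alpha_m m + (\mathbf{p}^2 + m^2)^{1/2}\}\psi_{\mathbf{p}} = (\mathbf{p}^2 + m^2)^{1/2}\xi_{\mathbf{p}},$$

and similarly

$$p_0\zeta_{\mathbf{p}} = -(\mathbf{p}^2 + m^2)^{1/2}\zeta_{\mathbf{p}},$$

showing that $\xi_{\mathbf{p}}$ and $\zeta_{\mathbf{p}}$ are eigenfunctions of $p_0$ with the eigenvalues
$(\mathbf{p}^2 + m^2)^{1/2}$ and $-(\mathbf{p}^2 + m^2)^{1/2}$ respectively. When one is working with
the operators

$$\frac{1}{2}\left\{1 + \frac{\alpha_r p_r + \alpha_m m}{(\mathbf{p}^2 + m^2)^{1/2}}\right\}, \qquad \frac{1}{2}\left\{1 - \frac{\alpha_r p_r + \alpha_m m}{(\mathbf{p}^2 + m^2)^{1/2}}\right\},$$

one should note that their squares are equal to themselves and their
product in either order is zero.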
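These two algebraic properties are easy to confirm numerically. The sketch below assumes one particular matrix representation of the α's (the common one with the Pauli matrices in the off-diagonal blocks and α_m diagonal); the choice of representation, the sample momentum, and all function names are assumptions made for illustration only.

```python
import math

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def lincomb(terms):
    """Sum of coefficient * matrix over a list of (coefficient, matrix) pairs."""
    n = len(terms[0][1])
    return [[sum(c * m[i][j] for c, m in terms) for j in range(n)]
            for i in range(n)]

def block_offdiag(s):
    """4x4 matrix [[0, s], [s, 0]] built from a 2x2 block s."""
    z = [0, 0]
    return [z + s[0], z + s[1], s[0] + z, s[1] + z]

# Assumed representation: alpha_r off-diagonal in the Pauli matrices, alpha_m diagonal.
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
ALPHA = [block_offdiag(s) for s in SIGMA]
ALPHA_M = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
IDENT = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

def projectors(p, m):
    """The two operators (1/2){1 +- (alpha_r p_r + alpha_m m)/(p^2 + m^2)^(1/2)}."""
    energy = math.sqrt(sum(c * c for c in p) + m * m)
    h = lincomb([(p[0], ALPHA[0]), (p[1], ALPHA[1]), (p[2], ALPHA[2]), (m, ALPHA_M)])
    plus = lincomb([(0.5, IDENT), (0.5 / energy, h)])
    minus = lincomb([(0.5, IDENT), (-0.5 / energy, h)])
    return plus, minus

def max_abs_diff(a, b):
    """Largest absolute difference between corresponding matrix elements."""
    return max(abs(a[i][j] - b[i][j]) for i in range(4) for j in range(4))

ZERO = [[0] * 4 for _ in range(4)]

if __name__ == "__main__":
    plus, minus = projectors([0.3, -1.2, 0.7], 1.0)   # arbitrary sample values
    print(max_abs_diff(matmul(plus, plus), plus))
    print(max_abs_diff(matmul(minus, minus), minus))
    print(max_abs_diff(matmul(plus, minus), ZERO))
    print(max_abs_diff(matmul(minus, plus), ZERO))
```

All four printed numbers should be zero to within rounding error, whatever momentum and mass are chosen.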

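The second quantization makes the $\psi$'s into operators like the $\eta$'s
of § 65, satisfying anticommutation relations like (11') of § 65. Using
the notation for the anticommutator

$$MN + NM = [M, N]_+, \tag{69}$$

we get

$$[\psi_{a\mathbf{x}}, \psi_{b\mathbf{x}'}]_+ = 0, \qquad [\bar\psi_{a\mathbf{x}}, \bar\psi_{b\mathbf{x}'}]_+ = 0, \qquad [\psi_{a\mathbf{x}}, \bar\psi_{b\mathbf{x}'}]_+ = \delta_{ab}\,\delta(\mathbf{x} - \mathbf{x}'), \tag{70}$$

the function $\delta(\mathbf{x} - \mathbf{x}')$ appearing in the last equation owing to the $\mathbf{x}$'s
taking on continuous ranges of values. On transforming to the
$\mathbf{p}$-representation according to (67), we get

$$[\psi_{a\mathbf{p}}, \psi_{b\mathbf{p}'}]_+ = 0, \qquad [\bar\psi_{a\mathbf{p}}, \bar\psi_{b\mathbf{p}'}]_+ = 0, \qquad [\psi_{a\mathbf{p}}, \bar\psi_{b\mathbf{p}'}]_+ = \delta_{ab}\,\delta(\mathbf{p} - \mathbf{p}'). \tag{71}$$

With $\xi$ and $\zeta$ defined again by (68), the last of equations (71) gives

$$[\xi_{a\mathbf{p}}, \bar\xi_{b\mathbf{p}'}]_+ = \frac{1}{2}\left\{1 + \frac{\alpha_r p_r + \alpha_m m}{(\mathbf{p}^2 + m^2)^{1/2}}\right\}_{ab}\delta(\mathbf{p} - \mathbf{p}'), \tag{72}$$

and similarly

$$[\zeta_{a\mathbf{p}}, \bar\zeta_{b\mathbf{p}'}]_+ = \frac{1}{2}\left\{1 - \frac{\alpha_r p_r + \alpha_m m}{(\mathbf{p}^2 + m^2)^{1/2}}\right\}_{ab}\delta(\mathbf{p} - \mathbf{p}') \tag{73}$$

and

$$[\xi_{a\mathbf{p}}, \bar\zeta_{b\mathbf{p}'}]_+ = [\zeta_{a\mathbf{p}}, \bar\xi_{b\mathbf{p}'}]_+ = 0.$$

According to the interpretation of § 65, the operators $\psi_{a\mathbf{p}}$ are
operators of annihilation of an electron of momentum $\mathbf{p}$ and the
operators $\bar\psi_{a\mathbf{p}}$ are operators of creation of an electron of momentum $\mathbf{p}$.
To avoid the unphysical notion of negative-energy electrons, we must
pass over to a new interpretation based on the positron theory of § 73.
The annihilation of a negative-energy electron is to be understood as
the creation of a hole in the sea of negative-energy electrons, or the
creation of a positron. So the operators $\zeta_{a\mathbf{p}}$ become operators of
creation of a positron. The positron has the momentum $-\mathbf{p}$, because
an amount $\mathbf{p}$ of momentum gets annihilated. Similarly the $\bar\zeta_{a\mathbf{p}}$ become
operators of annihilation of a positron of momentum $-\mathbf{p}$. The $\xi_{a\mathbf{p}}$
and $\bar\xi_{a\mathbf{p}}$ are operators of annihilation and creation respectively of an
ordinary, positive-energy electron of momentum $\mathbf{p}$.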


It should be noted that, although $\xi_{\mathbf{p}}$ has four components, only
two of them are independent, because the four are connected by

$$\left\{1 - \frac{\alpha_r p_r + \alpha_m m}{(\mathbf{p}^2 + m^2)^{1/2}}\right\}\xi_{\mathbf{p}} = 0,$$

which involves two independent equations. The two independent
components of $\xi_{\mathbf{p}}$ correspond to the annihilation of an electron in
each of the two independent states of spin. Similarly $\zeta_{\mathbf{p}}$ has only
two independent components, because of the equations

$$\left\{1 + \frac{\alpha_r p_r + \alpha_m m}{(\mathbf{p}^2 + m^2)^{1/2}}\right\}\zeta_{\mathbf{p}} = 0,$$

and they correspond to the creation of a positron in the two independent
states of spin.

The vacuum state, for which there are no electrons or positrons
present, is represented by the ket $|0_P\rangle$ satisfying

$$\xi_{a\mathbf{p}}|0_P\rangle = 0, \qquad \bar\zeta_{a\mathbf{p}}|0_P\rangle = 0. \tag{74}$$

We can use this ket as the standard ket of a representation. We then
have any ket expressed as

$$\Psi(\bar\xi_{a\mathbf{p}}, \zeta_{a\mathbf{p}})|0_P\rangle,$$

in which the function, or rather functional, $\Psi$ is a power series in the
variables $\bar\xi_{a\mathbf{p}}$, $\zeta_{a\mathbf{p}}$. Each term of $\Psi$ is like (17') of § 65. It must not
contain any of its variables to a higher power than the first. It corresponds
to the existence of certain (positive-energy) electrons and
certain positrons, in states specified by the labels of the variables
appearing in it.
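The restriction to first powers is a direct consequence of the anticommutation relations. As a hedged finite-dimensional illustration, the sketch below represents two discrete fermion states by 4×4 matrices using a Jordan-Wigner-style construction; this construction, the discretization, and the variable names are assumptions for the example and are not the representation used in the text.

```python
def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    return [[a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
            for i in range(len(a)) for k in range(len(b))]

def matmul(a, b):
    """Product of two square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(a):
    """Transpose; for these real matrices it is also the adjoint."""
    return [list(row) for row in zip(*a)]

def anticommutator(a, b):
    """[a, b]_+ = ab + ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(ab, ba)]

LOWER = [[0, 1], [0, 0]]      # empties a single state, basis (empty, occupied)
PARITY = [[1, 0], [0, -1]]    # sign factor that makes different states anticommute
IDENTITY_2 = [[1, 0], [0, 1]]

annihilate_1 = kron(LOWER, IDENTITY_2)
annihilate_2 = kron(PARITY, LOWER)
create_1 = transpose(annihilate_1)
create_2 = transpose(annihilate_2)

IDENTITY_4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
ZERO_4 = [[0] * 4 for _ in range(4)]

if __name__ == "__main__":
    print(anticommutator(annihilate_1, create_1) == IDENTITY_4)
    print(anticommutator(annihilate_1, create_2) == ZERO_4)
    print(anticommutator(create_1, create_2) == ZERO_4)
    print(matmul(create_1, create_1) == ZERO_4)
```

The last line shows that the square of a creation operator vanishes, so a second factor of the same variable in a term of the power series would contribute nothing.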

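From (12') of § 65, the total number of electrons is $\int \bar\psi_{a\mathbf{p}}\psi_{a\mathbf{p}}\,d^3p$
summed over $a$. We may write it in the notation of equation (12) of
§ 67 as $\int \bar\psi^\dagger_{\mathbf{p}}\psi_{\mathbf{p}}\,d^3p$. Transforming it to the $\mathbf{x}$-representation by (67),
we get

$$h^{-3}\iiint e^{i(\mathbf{xp})/\hbar}e^{-i(\mathbf{x}'\mathbf{p})/\hbar}\,\bar\psi^\dagger_{\mathbf{x}}\psi_{\mathbf{x}'}\,d^3x\,d^3x'\,d^3p = \int \bar\psi^\dagger_{\mathbf{x}}\psi_{\mathbf{x}}\,d^3x,$$

showing that the density of the electrons is $\bar\psi^\dagger_{\mathbf{x}}\psi_{\mathbf{x}}$. This result includes
an infinite constant representing the density of the sea of negative-energy
electrons.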

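We get a quantity of more physical significance if we take the total
charge $Q$, equal to the number of positive-energy electrons minus the
number of holes or positrons, all multiplied by $-e$. Thus

$$Q = -e\int (\bar\xi^\dagger_{\mathbf{p}}\xi_{\mathbf{p}} - \zeta^\dagger_{\mathbf{p}}\bar\zeta_{\mathbf{p}})\,d^3p. \tag{75}$$

We can evaluate this with the help of (68). Using the transpose of
the second of these equations, namely

$$\zeta^\dagger_{\mathbf{p}} = \psi^\dagger_{\mathbf{p}}\,\frac{1}{2}\left\{1 - \frac{\alpha^\dagger_r p_r + \alpha^\dagger_m m}{(\mathbf{p}^2 + m^2)^{1/2}}\right\},$$

we get

$$Q = -e\int \left\{\bar\psi^\dagger_{\mathbf{p}}\,\frac{1}{2}\left(1 + \frac{\alpha_r p_r + \alpha_m m}{(\mathbf{p}^2 + m^2)^{1/2}}\right)\psi_{\mathbf{p}} - \psi^\dagger_{\mathbf{p}}\,\frac{1}{2}\left(1 - \frac{\alpha^\dagger_r p_r + \alpha^\dagger_m m}{(\mathbf{p}^2 + m^2)^{1/2}}\right)\bar\psi_{\mathbf{p}}\right\}d^3p.$$

Now for any matrix $\alpha$ whose diagonal sum is zero, the anticommutation
relations (71) give

$$\bar\psi^\dagger_{\mathbf{p}}\alpha\psi_{\mathbf{p}'} + \psi^\dagger_{\mathbf{p}'}\alpha^\dagger\bar\psi_{\mathbf{p}} = \alpha_{ab}(\bar\psi_{a\mathbf{p}}\psi_{b\mathbf{p}'} + \psi_{b\mathbf{p}'}\bar\psi_{a\mathbf{p}}) = \alpha_{aa}\,\delta(\mathbf{p} - \mathbf{p}') = 0, \tag{76}$$

a result which we may assume still holds for $\mathbf{p}' = \mathbf{p}$. Then the
expression for $Q$ reduces to

$$Q = -e\int \tfrac{1}{2}(\bar\psi^\dagger_{\mathbf{p}}\psi_{\mathbf{p}} - \psi^\dagger_{\mathbf{p}}\bar\psi_{\mathbf{p}})\,d^3p.$$

Transforming it to the $\mathbf{x}$-representation as before, we get

$$Q = -e\int \tfrac{1}{2}(\bar\psi^\dagger_{\mathbf{x}}\psi_{\mathbf{x}} - \psi^\dagger_{\mathbf{x}}\bar\psi_{\mathbf{x}})\,d^3x,$$

showing that the charge density is

$$j_{0\mathbf{x}} = -\tfrac{1}{2}e(\bar\psi^\dagger_{\mathbf{x}}\psi_{\mathbf{x}} - \psi^\dagger_{\mathbf{x}}\bar\psi_{\mathbf{x}}). \tag{77}$$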


The interpretation of the one-electron wave function in § 68 gives,
besides the probability density $\bar\psi^\dagger\psi$, a probability current $\bar\psi^\dagger\alpha_r\psi$.
With second quantization we shall have correspondingly a flow of
electrons, given by the operator $\bar\psi^\dagger_{\mathbf{x}}\alpha_r\psi_{\mathbf{x}}$. The sea of negative-energy
electrons produces no resultant flow of electrons, from symmetry,
and so the electric current is

$$j_{r\mathbf{x}} = -e\,\bar\psi^\dagger_{\mathbf{x}}\alpha_r\psi_{\mathbf{x}}. \tag{78}$$

The total energy of the electrons is, from formula (29) of § 60, which
is valid also for fermions,

$$H_{P'} = \int \bar\psi^\dagger_{\mathbf{p}}\,p_0\,\psi_{\mathbf{p}}\,d^3p = \int \bar\psi^\dagger_{\mathbf{p}}(\alpha_r p_r + \alpha_m m)\psi_{\mathbf{p}}\,d^3p. \tag{79}$$

It becomes, when transformed to the $\mathbf{x}$-representation,

$$H_{P'} = \int \bar\psi^\dagger_{\mathbf{x}}(-i\hbar\,\alpha_r\psi^{\,r}_{\mathbf{x}} + \alpha_m m\,\psi_{\mathbf{x}})\,d^3x. \tag{80}$$

This total energy contains an infinite numerical term representing
the energy of the sea of negative-energy electrons.

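We get a quantity of more physical significance if we take the energy
of all the electrons and positrons, reckoning the energy of the vacuum
as zero. This quantity is

$$H_P = \int (\mathbf{p}^2 + m^2)^{1/2}(\bar\xi^\dagger_{\mathbf{p}}\xi_{\mathbf{p}} + \zeta^\dagger_{\mathbf{p}}\bar\zeta_{\mathbf{p}})\,d^3p \tag{81}$$

$$= \tfrac{1}{2}\int \{\bar\psi^\dagger_{\mathbf{p}}(\alpha_r p_r + \alpha_m m)\psi_{\mathbf{p}} - \psi^\dagger_{\mathbf{p}}(\alpha^\dagger_r p_r + \alpha^\dagger_m m)\bar\psi_{\mathbf{p}}\}\,d^3p + \tfrac{1}{2}\int (\mathbf{p}^2 + m^2)^{1/2}(\bar\psi^\dagger_{\mathbf{p}}\psi_{\mathbf{p}} + \psi^\dagger_{\mathbf{p}}\bar\psi_{\mathbf{p}})\,d^3p. \tag{82}$$

From (76), the first integral in (82) is the same as (79) and is just
$H_{P'}$. The second integral is an infinite constant and is minus the
energy of all the negative-energy electrons of the vacuum distribution.

We may take either $H_P$ or $H_{P'}$ as the Hamiltonian. The Heisenberg
equation of motion for $\psi_{a\mathbf{x}}$ is thus

$$\partial\psi_{a\mathbf{x}}/\partial x_0 = [\psi_{a\mathbf{x}}, H_P] = [\psi_{a\mathbf{x}}, H_{P'}],$$

and if we work this out we just get back to the wave equation for $\psi$,
namely (66).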

We must now look into the question of whether the theory is
relativistic. It is built up from operators $\psi$ which satisfy the field
equations (66). These equations are the same as the wave equation
for the one-electron wave function and are known to be invariant
under Lorentz transformations, provided $\psi$ transforms according to
the law (20) of Chapter XI. Our present theory goes beyond the
one-electron theory in that anticommutation relations are introduced
for the $\psi$'s and $\bar\psi$'s and it becomes necessary to verify that these
anticommutation relations are Lorentz invariant.

We  proceed  by  a method  analogous  to  that  of  § 75.  We  take  two 
general  points  x and'*'  in  space-time  and  form  the  anticommutator 


Kab(x’x')  = ^«(X)&,(X')  + ^(X0<A«(*)-  (83) 


We can evaluate it by working directly from the anticommutation
relations (71) for the Fourier components of $\psi$ and $\bar\psi$. A simpler way
is to note certain properties that $K_{ab}(x,x')$ must have, namely

(i) it involves $x_\mu$ and $x'_\mu$ only through their difference $x_\mu - x'_\mu$;

(ii) it satisfies the wave equation

$$\left(i\hbar\frac{\partial}{\partial x_0} + i\hbar\alpha_r\frac{\partial}{\partial x_r} - \alpha_m m\right)_{ab} K_{bc}(x,x') = 0 \tag{84}$$

on account of $\psi(x)$ satisfying (66);

(iii) for $x_0 = x'_0$ it has the value $\delta_{ab}\,\delta(\mathbf{x}-\mathbf{x}')$, as follows from the
third of equations (70).

These properties are sufficient to fix $K_{ab}(x,x')$ completely, since (iii)
fixes it for $x_0 = x'_0$, (ii) shows how it depends on $x_0$, and (i) then shows
how it depends on $x'_0$. The solution is easily seen to be

$$K_{ab}(x,x') = h^{-3}\int \sum \tfrac12\{1 + (\alpha_r p_r + \alpha_m m)/p_0\}_{ab}\, e^{-i(x-x')\cdot p/\hbar}\, d^3p, \tag{85}$$

where the $\sum$ means a summation over the two values $\pm(\mathbf{p}^2+m^2)^{1/2}$ for
$p_0$ with particular values for $p_1, p_2, p_3$. It satisfies (ii) since the operator
in (84) produces the factor $(p_0 - \alpha_r p_r - \alpha_m m)$ in the integrand of
(85), which factor gives zero when multiplied on the left into the
factor $\{\ \}$. It satisfies (iii) since, with $x_0 = x'_0$, the summation over $p_0$
makes the second term in $\{\ \}$ cancel out.
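The two verifications just described are purely algebraic, so they can be checked numerically. The sketch below is an illustration only and is not part of the original argument. It assumes one standard representation of the matrices ($\alpha_r = \rho_1\sigma_r$, $\alpha_m = \rho_3$), and the helper names (`kron`, `combine`, `check_85`) are hypothetical.

```python
import math

def kron(a, b):
    """Kronecker product of two 2x2 matrices, giving a 4x4 matrix."""
    return [[a[i // 2][j // 2] * b[i % 2][j % 2] for j in range(4)]
            for i in range(4)]

def matmul(a, b):
    """Product of two 4x4 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def combine(a, b, cb):
    """Return a + cb * b for 4x4 matrices (new matrix, inputs untouched)."""
    return [[a[i][j] + cb * b[i][j] for j in range(4)] for i in range(4)]

def max_abs(a):
    return max(abs(x) for row in a for x in row)

I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]
I4 = kron(I2, I2)
ZERO = [[0] * 4 for _ in range(4)]
# Assumed representation: alpha_r = rho_1 sigma_r, alpha_m = rho_3.
ALPHA = [kron(SX, s) for s in (SX, SY, SZ)]
ALPHA_M = kron(SZ, I2)

def check_85(p, m):
    """Return (worst entry of property (ii), worst entry of property (iii))."""
    big_m = combine(ZERO, ALPHA_M, m)            # alpha_m m
    for alpha_r, p_r in zip(ALPHA, p):
        big_m = combine(big_m, alpha_r, p_r)     # + alpha_r p_r
    energy = math.sqrt(sum(x * x for x in p) + m * m)
    worst_ii = 0.0
    total = ZERO
    for p0 in (energy, -energy):
        # The factor { } of (85): (1/2){1 + (alpha_r p_r + alpha_m m)/p0}
        brace = combine(combine(ZERO, I4, 0.5), big_m, 0.5 / p0)
        # The factor produced by the operator in (84): p0 - alpha_r p_r - alpha_m m
        wave_op = combine(combine(ZERO, I4, p0), big_m, -1)
        worst_ii = max(worst_ii, max_abs(matmul(wave_op, brace)))
        total = combine(total, brace, 1)
    # Summing over both signs of p0 should leave the unit matrix.
    worst_iii = max_abs(combine(total, I4, -1))
    return worst_ii, worst_iii

print(check_85((0.3, -1.2, 2.0), 0.511))
```

Both returned numbers should be zero up to rounding error: the first confirms that the wave operator annihilates the factor $\{\ \}$ for either sign of $p_0$, the second that the two factors add up to unity at equal times.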

The law of transformation for $\psi$ and $\bar\psi$ given in § 68 has the effect
of making the quantities $\bar\psi^\dagger(x')\alpha_\mu\psi(x)$ transform like the four
components of a 4-vector and making $\bar\psi^\dagger(x')\alpha_m\psi(x)$ invariant. Thus

$$l^\mu\,\bar\psi^\dagger(x')\alpha_\mu\psi(x) + S\,\bar\psi^\dagger(x')\alpha_m\psi(x) \tag{86}$$

is invariant with $l^\mu$ any 4-vector and $S$ any scalar. The invariance
of (86) must be sufficient to ensure the correct transformation law
for $\psi$ and $\bar\psi$, since it enables one to deduce the invariance of the wave
equation for $\psi$, by taking $l^\mu = i\hbar\,\partial/\partial x_\mu$, $S = -m$.

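The invariance of (86) leads to the invariance of

$$(l^\mu\alpha_\mu + S\alpha_m)_{ab}\{\bar\psi_a(x')\psi_b(x) + \psi_b(x)\bar\psi_a(x')\}.$$

Thus

$$(l^\mu\alpha_\mu + S\alpha_m)_{ab}K_{ba}(x,x') \tag{87}$$

should be invariant with $K_{ab}(x,x')$ given by (85), and its invariance
would be sufficient to ensure the invariance of the anticommutation
relations. We get for (87)

$$\begin{aligned}
h^{-3}&\int\sum\tfrac12(l^\mu\alpha_\mu + S\alpha_m)_{ab}(p_0 + \alpha_r p_r + \alpha_m m)_{ba}\, e^{-i(x-x')\cdot p/\hbar}\, p_0^{-1}\, d^3p\\
&= h^{-3}\int\sum\tfrac12\{(l_0 - l_s\alpha_s + S\alpha_m)(p_0 + \alpha_r p_r + \alpha_m m)\}_{aa}\, e^{-i(x-x')\cdot p/\hbar}\, p_0^{-1}\, d^3p\\
&= h^{-3}\int\sum 2(l_0 p_0 - l_r p_r + Sm)\, e^{-i(x-x')\cdot p/\hbar}\, p_0^{-1}\, d^3p.
\end{aligned}\tag{88}$$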

This is Lorentz invariant because the differential element $p_0^{-1}\,d^3p$ is
Lorentz invariant. Thus the relativistic invariance of the theory is
proved.
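The invariance of the element $p_0^{-1}\,d^3p$ can be made explicit by rewriting the sum over the two signs of $p_0$ together with the three-dimensional integral as a single four-dimensional integral. The following is a sketch and is not taken from the text: $F(p)$ stands for any integrand, $\operatorname{sgn}$ is the sign function, and the Lorentz transformation is assumed not to reverse the direction of time.

```latex
\sum_{p_0=\pm(\mathbf{p}^2+m^2)^{1/2}} \int F(p)\, p_0^{-1}\, d^3p
  \;=\; 2\int F(p)\,\operatorname{sgn}(p_0)\,
        \delta\!\left(p_0^{\,2}-\mathbf{p}^2-m^2\right) d^4p .
```

On the right-hand side $d^4p$ and the argument of the $\delta$ function are invariant, and $\operatorname{sgn}(p_0)$ is invariant for a time-like $p$ under such transformations, so the whole measure is invariant.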

79.  The  interaction 

The  complete  Hamiltonian  for  electrons  and  positrons  interacting 
with  the  electromagnetic  field  is 

$$H = H_F + H_P + H_Q, \tag{89}$$

where  HF  is  the  Hamiltonian  for  the  electromagnetic  field  alone, 
given  by  (19)  or  (45),  HP  is  the  Hamiltonian  for  the  electrons  and 
positrons  alone,  given  by  (80)  or  (81),  and  HQ  is  the  interaction  energy, 
involving  the  dynamical  variables  of  the  electrons  and  positrons  as 
well  as  those  of  the  electromagnetic  field.  We  take 

$$H_Q = \int A^\mu j_\mu\, d^3x, \tag{90}$$

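with $j_\mu$ given by (77) and (78), as we shall see that this gives the correct
equations of motion. Thus, with neglect of infinite numerical terms,

$$H = \int\{\bar\psi^\dagger\alpha_r(-i\hbar\psi^r - eA^r\psi) + \bar\psi^\dagger\alpha_m m\psi - eA^0\bar\psi^\dagger\psi\}\, d^3x - (8\pi)^{-1}\int(B_\mu B^\mu + A_\mu{}^r A^{\mu r})\, d^3x. \tag{91}$$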

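Let us work out the Heisenberg equations of motion that follow
from the Hamiltonian (91). We have

$$\begin{aligned}
i\hbar\,\partial\psi_{ax}/\partial x_0 &= \psi_{ax}H - H\psi_{ax} = \psi_{ax}(H_P + H_Q) - (H_P + H_Q)\psi_{ax}\\
&= \int[\psi_{ax}, \bar\psi_{bx'}]_+\{\alpha_r(-i\hbar\psi^r_{x'} - eA^r_{x'}\psi_{x'}) + \alpha_m m\psi_{x'} - eA^0_{x'}\psi_{x'}\}_b\, d^3x'\\
&= \{\alpha_r(-i\hbar\psi^r_x - eA^r_x\psi_x) + \alpha_m m\psi_x - eA^0_x\psi_x\}_a.
\end{aligned}$$

Thus

$$\{\alpha_\mu(i\hbar\,\partial/\partial x_\mu + eA^\mu) - \alpha_m m\}\psi = 0. \tag{92}$$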

This agrees with the one-electron wave equation (11) of Chapter XI.
Since $H$ is real, the equation of motion for $\bar\psi$ will be the conjugate of
the equation of motion for $\psi$ and so will agree with (12) of Chapter XI.
Thus the interaction (90) gives correctly the action of the field on the
electrons and positrons. Further we have, making use of the P.B.
relations in (46),

$$\partial A_\mu/\partial x_0 = [A_\mu, H] = [A_\mu, H_F] = B_\mu \tag{93}$$

and

$$\partial B_{\mu x}/\partial x_0 = [B_{\mu x}, H] = [B_{\mu x}, H_F] + [B_{\mu x}, H_Q] = \nabla^2 A_{\mu x} + 4\pi j_{\mu x}. \tag{94}$$

(93) and (94) lead to

$$\Box A_\mu = 4\pi j_\mu, \tag{95}$$

which  agrees  with  the  Maxwell  theory  and  shows  that  the  interaction 
(90)  gives  correctly  the  action  of  the  electrons  and  positrons  on  the 
field. 
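The step from (93) and (94) to (95) is a single substitution. The lines below spell it out as a sketch, assuming that (93) states $\partial A_\mu/\partial x_0 = B_\mu$, that (94) states $\partial B_\mu/\partial x_0 = \nabla^2 A_\mu + 4\pi j_\mu$, and that $\Box$ denotes $\partial^2/\partial x_0^2 - \nabla^2$ in units with $c = 1$.

```latex
\frac{\partial^2 A_\mu}{\partial x_0^{\,2}}
  = \frac{\partial B_\mu}{\partial x_0}
  = \nabla^2 A_\mu + 4\pi j_\mu
\quad\Longrightarrow\quad
\Box A_\mu \equiv \left(\frac{\partial^2}{\partial x_0^{\,2}} - \nabla^2\right) A_\mu = 4\pi j_\mu .
```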

To complete the theory we must bring in the supplementary
conditions (54). We must verify that they are in agreement with the
equations of motion. The method used in § 77, which consisted in
showing that the supplementary conditions at different times in the
Heisenberg picture are consistent with one another, is no longer
applicable, because the quantum conditions connecting dynamical
variables at different times get altered by the interaction in a way
that is too complicated to be worked out. So we shall obtain all the
supplementary conditions affecting the dynamical variables at one
instant of time and check whether they are consistent.

We have again equations (56). A further differentiation with respect
to $x_0$ gives

$$\Box\,\partial A_\mu/\partial x_\mu \approx 0. \tag{96}$$

Now the equation of motion for $\psi$, namely (92), leads, as in § 68, to

$$\partial(\bar\psi^\dagger\alpha_\mu\psi)/\partial x_\mu = 0.$$

This is the same as

$$\partial j_\mu/\partial x_\mu = 0, \tag{97}$$

because the difference between $-e\bar\psi^\dagger\psi$ and $j_0$ is constant in time, even
though it is infinite. From (95) we now see that (96) holds as a strong
equation. Thus equations (56) are the only independent supplementary
conditions affecting the dynamical variables at one instant of
time. The first of them gives (57), as before, and the second now gives,
with the help of (95) for $\mu = 0$,

$$(A_0{}^r + B_r)^r + 4\pi j_0 \approx 0. \tag{98}$$

This may be written

$$(A_0 + U)^{rr} + 4\pi j_0 \approx 0 \tag{99}$$

or, from (39),

$$\operatorname{div}\mathcal{E} - 4\pi j_0 \approx 0, \tag{100}$$

and is just one of the Maxwell equations.

One can see without detailed calculation that, for any two points
$\mathbf{x}$ and $\mathbf{x}'$ at the same time,

$$[j_{0x}, j_{0x'}] = 0,$$

since, from the form of (70), the P.B. must be a multiple of $\delta(\mathbf{x}-\mathbf{x}')$
and cannot contain derivatives of $\delta(\mathbf{x}-\mathbf{x}')$, while also it has to be
antisymmetrical between $\mathbf{x}$ and $\mathbf{x}'$. Thus the extra terms $4\pi j_{0x}$ in
equations (98) for various values of $\mathbf{x}$, as compared with the corresponding
equations (58), commute with one another as well as with all the
other dynamical variables occurring in (58) and (57). It follows that
these extra terms will not disturb the consistency of (58) and (57),
and hence (98) and (57) are consistent.

Our method of introducing interaction into the theory was not
relativistic, since the interaction energy (90) involves the dynamical
variables at an instant of time in some Lorentz frame. It therefore
becomes questionable whether the theory with interaction is a
relativistic one. Our field equations, namely (92) and (95), are evidently
relativistic and so are the supplementary conditions (54). It remains
uncertain whether the quantum conditions are Lorentz invariant.

We know the quantum conditions connecting all our dynamical
variables $A_{\mu x}$, $B_{\mu x}$, $\psi_{ax}$, $\bar\psi_{ax}$ at a given time $x_0$. We cannot, as
mentioned above, work out the general quantum conditions connecting
dynamical variables at any two points in space-time, because the
interaction makes it too complicated. We shall therefore make an
infinitesimal Lorentz transformation and work out the quantum
conditions at a given time in the new frame of reference. If we can
establish that the quantum conditions are invariant under infinitesimal
Lorentz transformations, their invariance under finite Lorentz
transformations will follow.

Let $x_0^*$ be the time coordinate in the new frame of reference. It is
connected with the original coordinates by

$$x_0^* = x_0 + \epsilon v_r x_r, \tag{101}$$

where $\epsilon$ is an infinitesimal number and $v_r$ is a three-dimensional vector,
$\epsilon v_r$ being the relative velocity of the two frames. We shall neglect
terms of order $\epsilon^2$.

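A field quantity $\kappa$ at the place $\mathbf{x}$ at the time $x_0^*$ in the new frame
has the value

$$\kappa(\mathbf{x}, x_0^*) = \kappa(\mathbf{x}, x_0) + (x_0^* - x_0)\,\partial\kappa/\partial x_0 = \kappa(\mathbf{x}, x_0) + \epsilon v_r x_r[\kappa_x, H]. \tag{102}$$

Its P.B. with another such field quantity $\lambda(\mathbf{x}', x_0^*)$ is

$$\begin{aligned}
[\kappa(\mathbf{x},x_0^*), \lambda(\mathbf{x}',x_0^*)] &= [\kappa(\mathbf{x},x_0) + \epsilon v_r x_r[\kappa_x,H],\ \lambda(\mathbf{x}',x_0) + \epsilon v_s x'_s[\lambda_{x'},H]]\\
&= [\kappa(\mathbf{x},x_0), \lambda(\mathbf{x}',x_0)] + \epsilon v_r x'_r[\kappa_x,[\lambda_{x'},H]] + \epsilon v_r x_r[[\kappa_x,H],\lambda_{x'}]\\
&= [\kappa(\mathbf{x},x_0), \lambda(\mathbf{x}',x_0)] + \epsilon v_r(x'_r - x_r)[\kappa_x,[\lambda_{x'},H]] + \epsilon v_r x_r[[\kappa_x,\lambda_{x'}],H].
\end{aligned}\tag{103}$$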

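If $\kappa$ and $\lambda$ are $\psi$ or $\bar\psi$ variables, we should be interested in their
anticommutator instead of their P.B. Using the notation $[\kappa,\lambda]_+$ for the
anticommutator, we have

$$\begin{aligned}
[\kappa(\mathbf{x},x_0^*), \lambda(\mathbf{x}',x_0^*)]_+ &= [\kappa(\mathbf{x},x_0), \lambda(\mathbf{x}',x_0)]_+ + \epsilon v_r x'_r[\kappa_x,[\lambda_{x'},H]]_+ + \epsilon v_r x_r[[\kappa_x,H],\lambda_{x'}]_+\\
&= [\kappa(\mathbf{x},x_0), \lambda(\mathbf{x}',x_0)]_+ + \epsilon v_r(x'_r - x_r)[\kappa_x,[\lambda_{x'},H]]_+ + \epsilon v_r x_r[[\kappa_x,\lambda_{x'}]_+,H].
\end{aligned}\tag{104}$$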
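The last step in each of (103) and (104) rests on an operator identity: the Jacobi identity for commutators, and its counterpart in which the outer brackets acting on $\kappa$ and $\lambda$ are anticommutators. Both are homogeneous in the brackets, so the factor $i\hbar$ that separates a P.B. from a commutator drops out. As an illustration that is not part of the original text, the sketch below checks both identities exactly on random integer matrices standing in for $\kappa$, $\lambda$ and $H$; all function names are hypothetical.

```python
import random

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def lin(a, b, sign):
    """Return a + sign * b, entry by entry."""
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def comm(a, b):
    """ab - ba, which is i*hbar times the P.B. [a, b]."""
    return lin(matmul(a, b), matmul(b, a), -1)

def anti(a, b):
    """ab + ba, the anticommutator [a, b]_+."""
    return lin(matmul(a, b), matmul(b, a), +1)

def random_matrix(n, rng):
    return [[rng.randint(-5, 5) for _ in range(n)] for _ in range(n)]

def residuals(n=4, seed=0):
    """Left side minus right side of the identities behind (103) and (104)."""
    rng = random.Random(seed)
    k, l, h = (random_matrix(n, rng) for _ in range(3))
    # (103): [k,[l,H]] + [[k,H],l] - [[k,l],H]
    r103 = lin(lin(comm(k, comm(l, h)), comm(comm(k, h), l), +1),
               comm(comm(k, l), h), -1)
    # (104): [k,[l,H]]_+ + [[k,H],l]_+ - [[k,l]_+,H]
    r104 = lin(lin(anti(k, comm(l, h)), anti(comm(k, h), l), +1),
               comm(anti(k, l), h), -1)
    return r103, r104

def is_zero(m):
    return all(x == 0 for row in m for x in row)

r103, r104 = residuals()
print(is_zero(r103), is_zero(r104))
```

Integer entries are used so that the residuals vanish exactly rather than to within rounding error.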

With $\kappa$ and $\lambda$ any two of the basic variables $A_\mu$, $B_\mu$, $\psi_a$, $\bar\psi_a$, the P.B.
$[\kappa_x, \lambda_{x'}]$ or anticommutator $[\kappa_x, \lambda_{x'}]_+$, as the case may be, is a number,
and so the last term in (103) or (104) vanishes. We are left with

$$\begin{aligned}
[\kappa(\mathbf{x},x_0^*), \lambda(\mathbf{x}',x_0^*)]_\pm = {}&[\kappa(\mathbf{x},x_0), \lambda(\mathbf{x}',x_0)]_\pm + \epsilon v_r(x'_r - x_r)[\kappa_x,[\lambda_{x'}, H_F + H_P]]_\pm\\
&+ \epsilon v_r(x'_r - x_r)[\kappa_x,[\lambda_{x'}, H_Q]]_\pm,
\end{aligned}\tag{105}$$

where $[\kappa, \lambda]_\pm$ denotes the P.B. or the anticommutator, as the case may
be. From the form (90) for $H_Q$ we see that $[\lambda_{x'}, H_Q]$ can involve only
the dynamical variables $A_{\mu x'}$, $\psi_{ax'}$, $\bar\psi_{ax'}$ and cannot involve any
derivatives of these variables. It follows that $[\kappa_x, [\lambda_{x'}, H_Q]]_\pm$, if it does not
vanish, will be a multiple of $\delta(\mathbf{x}-\mathbf{x}')$ and will not contain terms with
derivatives of $\delta(\mathbf{x}-\mathbf{x}')$. Hence the last term of (105) vanishes. We
can conclude that $[\kappa(\mathbf{x},x_0^*), \lambda(\mathbf{x}',x_0^*)]_\pm$ has the same value as when
there is no interaction, and is thus Lorentz invariant from our earlier
work.
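The reason the last term of (105) vanishes, and why the absence of derivatives of the $\delta$ function matters, can be seen by integrating against a smooth test function. The lines below are an illustrative sketch; the test function $f$ is an assumption introduced here and does not appear in the text.

```latex
\int f(\mathbf{x}')\,(x'_r - x_r)\,\delta(\mathbf{x}-\mathbf{x}')\,d^3x'
   = f(\mathbf{x})\,(x_r - x_r) = 0,
\qquad\text{whereas}\qquad
\int f(\mathbf{x}')\,(x'_r - x_r)\,
   \frac{\partial}{\partial x'_s}\delta(\mathbf{x}-\mathbf{x}')\,d^3x'
   = -\delta_{rs}\, f(\mathbf{x}).
```

A multiple of $\delta(\mathbf{x}-\mathbf{x}')$ is thus killed by the factor $(x'_r - x_r)$, while a term containing a derivative of the $\delta$ function would survive.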

A possible criticism of the above proof should be noted. At several
places we worked out expressions in powers of $\epsilon$ and neglected $\epsilon^2$.
Such a procedure cannot be valid for calculating $[\kappa(x), \lambda(x')]_\pm$ with $x$
and $x'$ two general points in space-time lying close together, so that
$x - x'$ is of order $\epsilon$, because the result of the calculation should be
a function of the $(x_\mu - x'_\mu)$'s having a singularity when the 4-vector
$x - x'$ lies on the light-cone and such a function, of course, cannot be
expanded as a power series in the $(x_\mu - x'_\mu)$'s.

To validate the argument we should reformulate it so as to avoid
the use of the $\delta$ function. Instead of evaluating $[\kappa(\mathbf{x},x_0^*), \lambda(\mathbf{x}',x_0^*)]_\pm$,
we should evaluate

$$\left[\int a_x\,\kappa(\mathbf{x},x_0^*)\,d^3x,\ \int b_{x'}\,\lambda(\mathbf{x}',x_0^*)\,d^3x'\right]_\pm, \tag{106}$$

where $a_x$ and $b_x$ are two arbitrary continuous functions of $x_1, x_2, x_3$.
Then the quantities that we need to expand in powers of $\epsilon$ all vary
continuously with a continuous change in the direction of the
time-axis, and the expansions are justifiable. The equations that we now
get are those of the previous argument multiplied by $a_x b_{x'}\,d^3x\,d^3x'$
and integrated. We are led to the same conclusion, namely that the P.B.
or anticommutator has the same value as when there is no interaction.

It will be seen that the reason why the interaction does not disturb
the quantum conditions is because it is so simple, involving only the
basic dynamical variables and not their derivatives. The P.B.s and
anticommutators have the same values as with no interaction
provided they refer to variables at two points in space-time that are at
the same time with respect to some observer. This means the two
points must be outside each other's light-cones and may approach
coincidence only along a path lying outside the light-cone.

80.  The  physical  variables 

A ket $|P\rangle$ that represents a physical state must satisfy the
supplementary conditions

$$(B_0 + A_r{}^r)|P\rangle = 0, \qquad (\operatorname{div}\mathcal{E} - 4\pi j_0)|P\rangle = 0. \tag{107}$$

A dynamical variable is physical if, when multiplied into any ket
satisfying these conditions, it gives another ket satisfying these
conditions. This requires that it shall commute with the quantities

$$B_0 + A_r{}^r, \qquad \operatorname{div}\mathcal{E} - 4\pi j_0. \tag{108}$$

Let us see what simple dynamical variables have this property.

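The transverse field variables $\mathcal{A}_r$, $\mathcal{B}_r$ evidently commute with the
quantities (108) and are physical. The variable $\psi_a$ commutes with
the first of the quantities (108) but not the second and is thus not
physical. We have

$$\psi_{ax}\bar\psi_{bx'}\psi_{bx'} - \bar\psi_{bx'}\psi_{bx'}\psi_{ax} = \delta_{ab}\,\delta(\mathbf{x}-\mathbf{x}')\psi_{bx'} = \psi_{ax}\,\delta(\mathbf{x}-\mathbf{x}'). \tag{109}$$

Thus

$$[\psi_{ax}, j_{0x'}] = (ie/\hbar)\,\psi_{ax}\,\delta(\mathbf{x}-\mathbf{x}').$$

From (42)

$$[e^{ieV_x/\hbar}, \operatorname{div}\mathcal{E}_{x'}] = (4\pi ie/\hbar)\, e^{ieV_x/\hbar}\,\delta(\mathbf{x}-\mathbf{x}').$$

Hence

$$[e^{ieV_x/\hbar}\psi_{ax},\ \operatorname{div}\mathcal{E}_{x'} - 4\pi j_{0x'}] = [e^{ieV_x/\hbar}, \operatorname{div}\mathcal{E}_{x'}]\psi_{ax} - 4\pi e^{ieV_x/\hbar}[\psi_{ax}, j_{0x'}] = 0.$$

Thus if we put

$$\psi^*_{ax} = e^{ieV_x/\hbar}\psi_{ax}, \tag{110}$$

$\psi^*_{ax}$ commutes with both expressions (108) and is physical. Similarly
$\bar\psi^*_{ax}$ is physical. The variables $\mathcal{A}_r$, $\mathcal{B}_r$, $\psi^*_a$, $\bar\psi^*_a$ are the only independent
physical variables, apart from the quantities (108) themselves.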

We have

$$j_0 = -\tfrac12 e(\bar\psi^{*\dagger}\psi^* - \psi^{*\dagger}\bar\psi^*), \qquad j_r = -e\,\bar\psi^{*\dagger}\alpha_r\psi^*. \tag{111}$$

Thus the charge density and current are physical. Also it is easily
seen that $\mathcal{E}$ and $\mathcal{H}$ are physical, just as in the case when there are
no electrons and positrons present. All those variables are physical
that are unaffected by the arbitrariness that exists in the
electromagnetic potentials in the Maxwell theory.

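The operator $\psi_{ax}$ represents the creation of a positron or the
annihilation of an electron at the place $\mathbf{x}$. Let us see what is the
physical significance of the operator $\psi^*_{ax}$. From (44)

$$i\hbar[e^{ieV_x/\hbar}, \mathcal{E}_{rx'}] = e\,e^{ieV_x/\hbar}(x_r - x'_r)|\mathbf{x}-\mathbf{x}'|^{-3},$$

and hence

$$i\hbar[\psi^*_{ax}, \mathcal{E}_{rx'}] = e\,\psi^*_{ax}(x_r - x'_r)|\mathbf{x}-\mathbf{x}'|^{-3}$$

or

$$\mathcal{E}_{rx'}\psi^*_{ax} = \psi^*_{ax}\{\mathcal{E}_{rx'} + e(x'_r - x_r)|\mathbf{x}'-\mathbf{x}|^{-3}\}. \tag{112}$$

Take a state $|P\rangle$ for which $\mathcal{E}_r$ at a certain point $\mathbf{x}'$ certainly has the
numerical value $c_r$, so that

$$\mathcal{E}_{rx'}|P\rangle = c_r|P\rangle.$$

Then from (112)

$$\mathcal{E}_{rx'}\psi^*_{ax}|P\rangle = \{c_r + e(x'_r - x_r)|\mathbf{x}'-\mathbf{x}|^{-3}\}\psi^*_{ax}|P\rangle,$$

so for the state $\psi^*_{ax}|P\rangle$, $\mathcal{E}_r$ at the point $\mathbf{x}'$ certainly has the value

$$c_r + e(x'_r - x_r)|\mathbf{x}'-\mathbf{x}|^{-3}.$$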

This means that the operator $\psi^*_{ax}$, besides creating a positron or
annihilating an electron at the point $\mathbf{x}$, increases the electric field at the
point $\mathbf{x}'$ by $e(x'_r - x_r)|\mathbf{x}'-\mathbf{x}|^{-3}$, which is just the classical Coulomb
field at $\mathbf{x}'$ of a positron with charge $e$ at the point $\mathbf{x}$. Thus the operator
$\psi^*_{ax}$ creates a positron at the point $\mathbf{x}$ together with its Coulomb field,
or else annihilates an electron at $\mathbf{x}$ together with its Coulomb field.
For electrons and positrons interacting with the electromagnetic
field it is the variables $\psi^*$, $\bar\psi^*$, rather than the variables $\psi$, $\bar\psi$, that
correspond to the physical processes of creation and annihilation of
electrons and positrons, since these processes must always be
accompanied by the appropriate Coulomb change in the electric field around
the point where the particle is created or annihilated. It is easily seen
that the variables $\psi^*_x$, $\bar\psi^*_x$ satisfy the same anticommutation relations
(70) as the unstarred variables. When we pass to the momentum
representation the important quantities will be, not the unphysical
variables $\psi_p$ defined by (67), but the physical variables $\psi^*_p$ defined by

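$$\psi^*_x = h^{-3/2}\int e^{i(\mathbf{x}\cdot\mathbf{p})/\hbar}\,\psi^*_p\, d^3p, \qquad \psi^*_p = h^{-3/2}\int e^{-i(\mathbf{x}\cdot\mathbf{p})/\hbar}\,\psi^*_x\, d^3x. \tag{113}$$

We must now replace (68) by

$$\xi^*_p = \tfrac12\left\{1 + \frac{\alpha_r p_r + \alpha_m m}{(\mathbf{p}^2+m^2)^{1/2}}\right\}\psi^*_p, \qquad \zeta^*_p = \tfrac12\left\{1 - \frac{\alpha_r p_r + \alpha_m m}{(\mathbf{p}^2+m^2)^{1/2}}\right\}\psi^*_p,$$

and take $\xi^*_p$ to represent the annihilation of an electron of momentum
$\mathbf{p}$, $\bar\xi^*_p$ the creation of an electron of momentum $\mathbf{p}$, $\zeta^*_p$ the creation of
a positron of momentum $-\mathbf{p}$ and $\bar\zeta^*_p$ the annihilation of a positron
of momentum $-\mathbf{p}$. The variables $\psi^*_p$, $\bar\psi^*_p$, $\xi^*_p$, $\bar\xi^*_p$, $\zeta^*_p$, $\bar\zeta^*_p$ will all satisfy
the same anticommutation relations as the corresponding unstarred
variables.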

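We can express the Hamiltonian entirely in terms of physical
variables. We have

$$\psi^{*r} = e^{ieV/\hbar}(\psi^r + ie/\hbar\cdot V^r\psi).$$

Thus

$$\begin{aligned}
H_P + H_Q &= \int\{\bar\psi^\dagger\alpha_r[-i\hbar\psi^r - e(\mathcal{A}^r - V^r)\psi] + \bar\psi^\dagger\alpha_m m\psi + A^0 j_0\}\, d^3x\\
&= \int\{\bar\psi^{*\dagger}\alpha_r(-i\hbar\psi^{*r} - e\mathcal{A}^r\psi^*) + \bar\psi^{*\dagger}\alpha_m m\psi^* + A^0 j_0\}\, d^3x.
\end{aligned}$$

The last term in the integrand here should be combined with $H_{FL}$.
From (49) and (57)

$$H_{FL} \approx -(8\pi)^{-1}\int(U - A_0)(U + A_0)^{rr}\, d^3x \approx \tfrac12\int(U - A_0)\,j_0\, d^3x$$

with the help of (99). Thus

$$H_{FL} + \int A^0 j_0\, d^3x \approx \tfrac12\int(U + A_0)\,j_0\, d^3x.$$

Integrating (99) with the help of formula (72) of § 38, we get

$$H_{FL} + \int A^0 j_0\, d^3x \approx \tfrac12\iint\frac{j_{0x}\,j_{0x'}}{|\mathbf{x}-\mathbf{x}'|}\, d^3x\, d^3x'.$$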

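Thus we get

$$H \approx H^*$$

with

$$H^* = \int\{\bar\psi^{*\dagger}\alpha_r(-i\hbar\psi^{*r} - e\mathcal{A}^r\psi^*) + \bar\psi^{*\dagger}\alpha_m m\psi^*\}\, d^3x + H_{FT} + \tfrac12\iint\frac{j_{0x}\,j_{0x'}}{|\mathbf{x}-\mathbf{x}'|}\, d^3x\, d^3x'. \tag{114}$$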
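The double integral in (114) comes from solving (99) for the longitudinal potentials. The intermediate step can be sketched as follows, assuming that $A_0 + U$ vanishes at infinity and using $\nabla^2|\mathbf{x}-\mathbf{x}'|^{-1} = -4\pi\,\delta(\mathbf{x}-\mathbf{x}')$, which is presumably the content of the formula of § 38 cited above.

```latex
\begin{aligned}
(A_0 + U)^{rr} \approx -4\pi j_0
  \quad&\Longrightarrow\quad
  (A_0 + U)_x \approx \int \frac{j_{0x'}}{|\mathbf{x}-\mathbf{x}'|}\, d^3x', \\
\tfrac12 \int (U + A_0)\, j_0 \, d^3x
  \;&\approx\; \tfrac12 \iint \frac{j_{0x}\, j_{0x'}}{|\mathbf{x}-\mathbf{x}'|}\, d^3x\, d^3x' .
\end{aligned}
```

The factor $\tfrac12$ ensures that each pair of charge elements is counted once, as in the classical electrostatic energy.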

We may use $H^*$ instead of $H$ as our Hamiltonian. It leads to the
same Schrödinger equation for a physical ket, since if $|P\rangle$ is physical

$$H^*|P\rangle = H|P\rangle.$$

Also it leads to the same Heisenberg equations of motion for physical
variables, since if $\xi$ is a physical variable

$$[\xi, H^*] \approx [\xi, H].$$

Thus $H^*$ and $H$ are equivalent Hamiltonians for the physical
quantities, and the others do not matter.

$H^*$ involves only physical variables. The longitudinal field variables
do not appear in it. Instead of them we have the last term of (114),
which is just the Coulomb interaction energy of any charges that are
present. The appearance of such a term in a relativistic theory is
rather strange, as it is an energy associated with the instantaneous
propagation of forces. It appears as a result of our having transformed
the theory a long way from the Heisenberg form in which the
relativistic invariance of the theory is manifest.

We could set up a representation by taking as standard ket the
product of the standard ket $|0_F\rangle$ for the electromagnetic field alone,
given by (61) and (62), with the standard ket $|0_P\rangle$ for the electrons
and positrons alone, given by (74). This representation would not be
a convenient one, however, because its standard ket does not satisfy
the second of the supplementary conditions (107).

We get a more convenient representation if we take another
standard ket $|Q\rangle$ satisfying

$$(B_0 + A_r{}^r)|Q\rangle = 0, \qquad (\operatorname{div}\mathcal{E} - 4\pi j_0)|Q\rangle = 0, \tag{115}$$

$$\bar{\mathcal{A}}_{r\mathbf{k}}|Q\rangle = 0, \qquad \xi^*_{ap}|Q\rangle = 0, \qquad \bar\zeta^*_{ap}|Q\rangle = 0. \tag{116}$$

These conditions are consistent, because the operators on $|Q\rangle$ in
them all commute or anticommute with each other, and there are
enough of them to fix $|Q\rangle$ completely, apart from a numerical factor,
because there are as many of them as of the conditions for $|0_F\rangle|0_P\rangle$.
The conditions (115) show that $|Q\rangle$ satisfies the supplementary
conditions and so represents a physical state. The conditions (116) show
that $|Q\rangle$ represents a state for which there are no photons, electrons,
or positrons present.

Any ket $|P\rangle$ that satisfies the supplementary conditions (107) and
so represents a physical state, can be expressed as some physical
variable multiplied into $|Q\rangle$. The only independent physical
variables that give non-vanishing results when applied to $|Q\rangle$ are $\mathcal{A}_{r\mathbf{k}}$,
$\bar\xi^*_{ap}$, $\zeta^*_{ap}$. Hence

$$|P\rangle = \Psi(\mathcal{A}_{r\mathbf{k}}, \bar\xi^*_{ap}, \zeta^*_{ap})|Q\rangle. \tag{117}$$

Thus $|P\rangle$ is represented by a wave functional $\Psi$ involving the variables
$\mathcal{A}_{r\mathbf{k}}$, $\bar\xi^*_{ap}$, $\zeta^*_{ap}$. It is a power series in these variables, the various terms
in it corresponding to the existence of various numbers of photons,
electrons, and positrons, with the Coulomb fields around the electrons
and positrons.

In  using  the  representation  (117)  together  with  the  Hamiltonian  H*, 
we  have  a form  of  the  theory  in  which  we  can  ignore  the  conditions 
(115),  as  they  have  no  effect  on  the  kets  (117).  We  must  retain  the 
conditions  (116).  The  longitudinal  variables  then  no  longer  appear 
in  the  theory. 


81.  Interpretation 

The  foregoing  work  establishes  the  basic  equations  of  quantum 
electrodynamics.  There  are  two  forms  of  the  theory,  involving  the 
Hamiltonians  H and  H*  respectively.  We  must  now  consider  the 
interpretation  and  application  of  the  theory.  We  shall  take  the  H* 
form  for  definiteness.  The  argument  would  be  essentially  the  same 
with  the  H form. 

The ket $|Q\rangle$ represents a state for which there are no photons,
electrons, or positrons present. One would be inclined to suppose this
state to be the perfect vacuum, but it cannot be, because it is not
stationary. For it to be stationary we should need to have

$$H^*|Q\rangle = C|Q\rangle$$

with $C$ a number. Now $H^*$ contains the terms

$$-e\int\bar\psi^{*\dagger}\alpha_r\mathcal{A}^r\psi^*\, d^3x + \tfrac12\iint\frac{j_{0x}\,j_{0x'}}{|\mathbf{x}-\mathbf{x}'|}\, d^3x\, d^3x', \tag{118}$$

which do not give numerical factors when applied to $|Q\rangle$ and which
therefore spoil the stationary character of $|Q\rangle$.

Let us call the state Q represented by $|Q\rangle$ the no-particle state at
a certain time. If we start with the no-particle state it does not remain
the no-particle state. Particles get created where none previously
existed, their energy coming from the interaction part of the
Hamiltonian.

To study this spontaneous creation of particles, we take the ket
$|Q\rangle$ as initial ket in the Schrödinger picture and treat the terms (118)
as a perturbation giving rise to a probability of the state Q jumping
into another state, in accordance with the theory of § 44. The first of
them, resolved into its Fourier components, contains a part

~~e(<xr)ab  //  (H9) 

which causes transitions in which a photon is emitted and
simultaneously an electron-positron pair is created. After a short time
the transition probability is proportional to the squared length of the
ket formed by multiplying (119) into the initial ket $|Q\rangle$, which is

^ (0£r)aft(as)cd  X 

x J/JJ  <0 1 £r+M  a fcV  0V+w  1 d'kcPpdwdy 

= e2(“rUK)cd  ////  J*V|X 

x[&>&-]+[£p+m.  d3kd*pdWdy. 

Using the values of the P.B. and anticommutators given by (4), (16),
(72), (73), we get an integrand which depends on the $\mathbf{k}$, $\mathbf{k}'$ variables
according to the law $|\mathbf{k}|^{-1}\delta(\mathbf{k}-\mathbf{k}')$ for large values of $\mathbf{k}$ and $\mathbf{k}'$. This
gives an integral that diverges, so the transition probability is infinite.
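The rate of divergence can be estimated by cutting the photon integral off at a large wave number. In the sketch below the cut-off $\Lambda$ is a hypothetical parameter introduced only for illustration, and the remaining factors of the integrand are assumed to stay finite at large $|\mathbf{k}|$.

```latex
\int_{|\mathbf{k}|<\Lambda} |\mathbf{k}|^{-1}\, d^3k
   \;=\; 4\pi \int_0^{\Lambda} k\, dk
   \;=\; 2\pi \Lambda^2 ,
```

which grows without limit as $\Lambda \to \infty$.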

The  second  term  of  (118),  resolved  into  its  Fourier  components, 
contains  terms  like  I*  Ip-  £p-  lp+p  -p',  which  cause  transitions  in  which 
two electron-positron pairs are created simultaneously. One can
calculate  the  transition  probability  as  before,  and  one  finds  again 
that  it  is  infinite.  From  these  calculations  one  can  conclude  that  the 
state  Q is  not  even  approximately  stationary. 

A theory which gives rise to infinite transition probabilities of
course cannot be correct. We can infer that there is something wrong
with quantum electrodynamics. This result need not surprise us,
because quantum electrodynamics does not provide a complete
description of nature. We know from experiment that there exist
other kinds of particles, which can get created when large amounts of
energy are available. All that we can expect from a theory of quantum
electrodynamics is that it shall be valid for processes in which there
is not enough energy available for these other particles to be created
to an appreciable extent, say for energies up to a few hundred MeV.
Thus the high-energy part of the interaction energy (118) is quite
unreliable, and it is this high-energy part that is responsible for the
infinities.

It appears that we must modify the high-energy part of the
interaction. At present there does not exist any detailed theory of the other
particles and so it is not possible to say how it ought to be modified.
The best we can do is to cut it out from the theory altogether, and so
remove the infinities. The precise form of the cut-off and the energy
where it is applied will be left unspecified. Of course, the cut-off
spoils the relativistic invariance of the theory. This is a blemish
which cannot be avoided in our present state of ignorance of
high-energy processes.

Even with a cut-off the no-particle state Q is not approximately
stationary. It therefore differs very much from the vacuum state.
The vacuum state must contain many particles, which may be
pictured as in a state of transient existence with violent
fluctuations.

Let us introduce the ket $|V\rangle$ to represent the vacuum state. It is
the eigenket of $H^*$ belonging to the lowest eigenvalue. Here and
subsequently $H^*$ denotes the expression (114) modified by the cut-off.
One might try to calculate $|V\rangle$ as a perturbation of the ket $|Q\rangle$, but
such a method would be of doubtful validity, because the difference
between $|V\rangle$ and $|Q\rangle$ is not small. No satisfactory way of calculating
$|V\rangle$ is known. In any case the result would depend strongly on the
cut-off, and since the cut-off is unspecified the result would not be a
definite one.

It follows that we must develop the theory without knowing $|V\rangle$.
This is not a great hardship, because we are not mainly interested in
the vacuum state. We are mainly interested in states which differ
from the vacuum through having a few particles present in addition
to those associated with the vacuum fluctuations, and we want to
know how these extra particles behave. For this purpose we focus our
attention on an operator $K$ representing the creation of the extra
particles, so that the state we are interested in appears as $K|V\rangle$.

We do not know how the ket $|V\rangle$ varies with the time in the
Schrödinger picture, since we do not know the lowest eigenvalue of $H^*$. To
avoid this difficulty we work in the Heisenberg picture in which $|V\rangle$ is
constant. We then require $K|V\rangle$ to represent another state in the
Heisenberg picture and thus to be another constant ket. This leads to

$$dK/dt = 0. \tag{120}$$

Usually $K$ will involve the time explicitly as well as Heisenberg
dynamical variables, so (120) gives

$$i\hbar\,\partial K/\partial t + KH^* - H^*K = 0. \tag{121}$$
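The passage from (120) to (121) uses the Heisenberg equation of motion for a variable with explicit time dependence. Written out as a sketch, with the quantum P.B. taken as $[u, v] = (uv - vu)/i\hbar$ in keeping with the convention used throughout:

```latex
\frac{dK}{dt} = \frac{\partial K}{\partial t} + [K, H^*]
             = \frac{\partial K}{\partial t} + \frac{KH^* - H^*K}{i\hbar} = 0
\quad\Longrightarrow\quad
i\hbar\,\frac{\partial K}{\partial t} + KH^* - H^*K = 0 .
```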

We now have each physical state determined by a solution $K$ of
(120) or (121). We obtained this result without knowing the vacuum
ket $|V\rangle$, and we can proceed to study $K$ without knowing $|V\rangle$. The
only further information about $K$ that we would have if we did know
$|V\rangle$ would be that two $K$'s, say $K_1$ and $K_2$, would correspond to the
same state if we had $(K_1 - K_2)|V\rangle = 0$. But we can get on without
this further information and count all different $K$'s satisfying (121)
as corresponding to different states.

We  are  thus  led  to  a drastic  alteration  of  one  of  the  basic  ideas  of 
quantum  mechanics,  namely  to  represent  a state  by  a linear  operator  and 
not  a ket  vector.  This  alteration  is  brought  about  by  the  complexities 
of  applying  quantum  mechanics  to  a field  and  by  our  ignorance  of 
high-energy  processes. 

A trivial solution of (120) or (121) is $K = 1$. This evidently
corresponds to the vacuum state.

A general solution may be put in the form of an explicit function of
$t$ and of the dynamical variables at time $t$. Let us use the symbol $\eta_t$
to denote collectively the emission operators at time $t$. Thus $\eta_t$
equals one of the variables $\mathcal{A}_{r\mathbf{k}}$, $\bar\xi^*_p$, $\zeta^*_p$ at the time $t$ in the Heisenberg
picture. The absorption operators are then $\bar\eta_t$. A solution of (121) then
appears as

$$K = f(t, \eta_t, \bar\eta_t). \tag{122}$$

We require some physical interpretation for the state represented by
this $K$, as the usual physical interpretation of quantum mechanics,
requiring a state to be represented by a ket, is no longer applicable.
We shall need to make some new assumptions.

Keeping to the Heisenberg picture, we introduce at each time $t$ the
ket $|Q_t\rangle$ satisfying the conditions (116) with respect to the Heisenberg
dynamical variables at time $t$. These conditions may now be written

$$\bar\eta_t|Q_t\rangle = 0.$$

The ket $|Q_t\rangle$ corresponds to no particles existing at the time $t$ and it
provides a reference ket for the discussion of general states at time $t$.

For any state fixed by a solution $K$ of (121) we form $K|Q_t\rangle$ and
assume that this ket determines what can be observed at the time t and is
to be interpreted according to the standard rules. We obtain $K$ in the
form (122) and then arrange it so that in each term all the absorption
operators $\bar\eta_t$ are to the right of all the emission operators $\eta_t$. It is then
said to be in the normal order. Any term in $K$ containing an absorption
operator then contributes nothing to $K|Q_t\rangle$. The surviving terms in
$K|Q_t\rangle$ will contain only emission operators, like (117). Each surviving
term is associated with certain particles in particular states, and the
square of the modulus of its coefficient (with the appropriate factors
$n!$ when there is more than one boson in the same state) is assumed to
be, after normalization, the probability of these particles existing in
these particular states at the time $t$.
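The probability rule just stated can be illustrated with a deliberately simplified case: a single boson mode, with $K$ already in normal order. The sketch below is an illustration under those assumptions and is not part of the original theory. The function name and the dictionary input are hypothetical; the caller is assumed to have dropped every term containing an absorption operator and to pass only the coefficient $c_n$ of each surviving term $c_n\,\eta^n$.

```python
from math import factorial

def occupation_probabilities(coeffs):
    """Probabilities of finding n bosons in one mode.

    coeffs maps n to the coefficient c_n of the surviving term c_n * eta**n
    in K|Q_t>. Terms with absorption operators (latent terms) must already
    have been removed by the caller.
    """
    # Squared modulus of each coefficient, with the factor n! for n bosons
    # in the same state.
    weights = {n: abs(c) ** 2 * factorial(n) for n, c in coeffs.items()}
    norm = sum(weights.values())
    if norm == 0:
        raise ValueError("no surviving terms: K|Q_t> vanishes")
    # The normalization is carried out separately at each time.
    return {n: w / norm for n, w in weights.items()}

print(occupation_probabilities({0: 1.0, 1: 1.0, 2: 1.0}))
# prints {0: 0.25, 1: 0.25, 2: 0.5}
```

With equal coefficients the two-boson term is twice as probable as the others because of the factor $2!$.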

We now have a general method of physical interpretation which is
rather similar to the usual one, but there are important differences.
A term in $K$ with an absorption operator on the right will not
contribute to $K|Q_t\rangle$ and so will not contribute anything observable at
time $t$. We may call it a latent term at the time $t$. Such a term cannot
be discarded as non-existent, because it will contribute observable
effects at other times. These latent terms are a new feature of the
theory and are to be understood as an incompleteness in the
description of a state in terms merely of the particles which can be observed
to be present at a certain time.

As  a consequence  of  the  occurrence  of  latent  terms,  if  K\Qt ) is 
normalized  at  one  time,  it  will  usually  not  be  normalized  at  other  times. 
We  thus  have  to  carry  out  a separate  normalization  for  each  time  in 
order  to  derive  the  probabilities. 
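
A minimal numerical sketch of this separate normalization is given below. It assumes that the surviving terms at a given time have already been found and are supplied as a table of coefficients; the function name, the data layout, and the numbers are all hypothetical and serve only to illustrate the rule of squaring the modulus, applying the n! factors, and normalizing at each time.

```python
from math import factorial, prod


def occupation_probabilities(terms):
    """Turn surviving normal-ordered terms into probabilities.

    `terms` maps a tuple of occupation numbers (one entry per boson state)
    to the complex coefficient of the corresponding product of emission
    operators. This layout is an illustrative assumption, not the text's.
    """
    # Weight of each term: |coefficient|^2 times n! for every boson state.
    weights = {
        occ: abs(c) ** 2 * prod(factorial(n) for n in occ)
        for occ, c in terms.items()
    }
    total = sum(weights.values())
    if total == 0:
        raise ValueError("no surviving terms to normalize")
    # The normalization is carried out for this one time only.
    return {occ: w / total for occ, w in weights.items()}


# Hypothetical coefficients of the surviving terms at two different times.
surviving_at_t1 = {(0, 0): 1.0, (1, 0): 0.5j, (2, 0): 0.1}
surviving_at_t2 = {(0, 0): 0.8, (1, 0): 0.9j, (2, 0): 0.3}

for label, terms in (("t1", surviving_at_t1), ("t2", surviving_at_t2)):
    probs = occupation_probabilities(terms)
    print(label, {occ: round(p, 4) for occ, p in probs.items()})
```

The unnormalized totals of the two tables differ, so a normalizing factor fixed at the first time would not give probabilities summing to unity at the second.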

82.  Applications 

There  are  two  important  applications  of  the  foregoing  theory  in 
which  effects  are  calculated  that  cannot  be  obtained  from  a more 
primitive  theory.  These  applications  are  concerned  with  a single 
electron  in  a static  electric  or  magnetic  field.  As  a consequence  of  the 
interaction  of  the  electron  with  electromagnetic  waves,  the  energy 
levels  are  shifted  somewhat  from  their  values  given  by  the  elementary 
theory.  The  important  cases  are: 

(i)  An  electron  in  the  Coulomb  field  of  a proton.  The  theory  here 
leads to a shift in the energy levels of the hydrogen atom. It is
named the Lamb shift, after its discoverer.

(ii)  An  electron  in  a uniform  magnetic  field.  The  extra  energy  is 
here interpreted as arising from an extra magnetic moment of
the  electron,  called  the  anomalous  magnetic  moment. 

To take a static field into account one merely has to introduce
potentials to describe it and add them on to the potentials in the
Hamiltonian. The potentials of the static field are functions of
x_1, x_2, x_3 only, and are numbers for each x_1, x_2, x_3, not dynamical
variables, so their introduction does not increase the number of degrees
of freedom.

The calculations of the Lamb shift and anomalous magnetic moment
are rather complicated. They are given in detail, working from the
Hamiltonian H, in the author’s book Lectures on Quantum Field
Theory (Academic Press, 1966). The results are in good agreement with
experiment and provide a confirmation of the theory.

These calculations were made in terms of the Heisenberg picture
throughout. One may tackle quantum electrodynamics on the
Schrödinger picture, looking for a solution of the Schrödinger equation
by taking the no-particle ket, or a ket corresponding to just a few
particles present, as the initial ket of a perturbation procedure and
applying the standard perturbation technique. One finds that the
later terms are large and depend strongly on the cut-off, or are
infinite if there is no cut-off. The perturbation procedure is not
logically valid under these conditions.

Nevertheless people have developed this method a long way and
have devised working rules for discarding infinities (in a theory
without cut-off) in a systematic manner, so that finite residual effects
remain. The procedure is described in many books, e.g. Heitler’s
Quantum Theory of Radiation (Clarendon Press, 1954). The original
calculations of the Lamb shift and anomalous magnetic moment were
carried out on these lines, long before the corresponding calculations
in the Heisenberg picture. The results are the same by both methods.
I do not see how these calculations based on the Schrödinger
picture, supplemented by some working rules, can be presented as a
logical development of the standard principles of quantum mechanics.
The Schrödinger picture is unsuited for dealing with quantum
electrodynamics, because the vacuum fluctuations play such a dominant role
in it. These fluctuations present great mathematical difficulties, and
also they are not of physical importance. They get bypassed when one
uses the Heisenberg picture, and one is then able to concentrate on
quantities that are of physical importance.

Quantum mechanics may be defined as the application of equations
of  motion  to  atomic  particles.  It  was  first  shown  that  atomic  particles 
are  subject  to  equations  of  motion  when  Bohr  set  up  his  theory  of  the 
hydrogen  atom.  The  big  development  was  made  when  Heisenberg 
discovered  the  need  for  non-commutative  multiplication.  The  domain 
of  applicability  of  the  theory  is  mainly  the  treatment  of  electrons  and 
other  charged  particles  interacting  with  the  electromagnetic  field — 
a domain  which  includes  most  of  low-energy  physics  and  chemistry. 

Now  there  are  other  kinds  of  interactions,  which  are  revealed  in 
high-energy  physics  and  are  important  for  the  description  of  atomic 
nuclei. These interactions are not at present sufficiently well understood
to be incorporated into a system of equations of motion.
Theories  of  them  have  been  set  up  and  much  developed  and  useful 
results  obtained  from  them.  But  in  the  absence  of  equations  of 
motion  these  theories  cannot  be  presented  as  a logical  development 
of  the  principles  set  up  in  this  book.  We  are  effectively  in  the  pre-Bohr 
era  with  regard  to  these  other  interactions. 

It is to be hoped that with increasing knowledge a way will eventually
be found for adapting the high-energy theories into a scheme
based  on  equations  of  motion,  and  so  unifying  them  with  those  of 
low-energy  physics. 